Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make conditional discard flags available externally #3

Open
wants to merge 235 commits into
base: amd-gfx-gpuopen-dev
Choose a base branch
from

Conversation

kuhar
Copy link

@kuhar kuhar commented Jan 26, 2022

This is so that we can set/access these flags in other .cpp files.

dstutt and others added 30 commits October 10, 2019 14:25
This pass attempts to merge s_buffer_load_dword instructions into larger
sizes.
Constraints are that the resource descriptor, glc flag and size are all the same
for contiguous load instructions.

TODO: it should be possible to add different instructions to this as well such
as s_buffer_store and the vector variants buffer_load/buffer_store

V2: Fixes to sbuff-merge test
V3: Fixed smrd test for smem merging
V4: Fixed for gfx10 changes

Change-Id: Ie98107ad3b27b3c8c2da7afc3d129336f924895a
Summary:
Added intrinsics for the instructions:
- buffer_load_ubyte
- buffer_load_ushort
- buffer_store_byte
- buffer_store_short

Added test cases to existing buffer load/store tests.

Now that upstream supports byte/short overloads of the normal load/store
intrinsics, this change is only needed until LLPC stops using the
intrinsics added here.

Change-Id: I1b8a0910c508c9520a84b74a14d0aea8293cef38
Vulkan exposed an issue with this for a case with v_mad_mixlo_f16 where the
upper 16 bits were not cleared.

Modifying this to clear the bits instead of just copying fixed the problem.

V2: Fixed up "Fix issue for zext of f16 to i32"
V3: Fixed fcanonicalize-elimination test

Change-Id: I6128deaf8ebc3489fdd3e0caead837410bb47160
Readlane should have a uniform index - but in some cases it can be non-uniform.
The original implementation of readlane defaulted to using a readfirstlane of
the vgpr index on the assumption that the index would be uniform across all
lanes.

However, there are some cases where we might want a readlane like operation that
is lowered to a waterfall that performs a readlane for each distinct index value
across lanes.

Essentially we form a loop that uses readfirstlane to get the first enabled
lane's index - enable all lanes with the same index, perform a readlane and put
the result into the vgpr result register, then disable those lanes and repeat.

The pathological case will form a loop of 64 - but in most cases it will be
fewer.
In the case where the index IS uniform across all lanes, it will loop once,
which is slightly more expensive than the original code, but not a lot more.

V2: Initialize accumulator register in waterfall code
V3: Fixed for gfx10 changes

Change-Id: Ic52dfb14b053db7713b61c85e8ac766991aecdb8
Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint

Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.

This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.

The algorithm used is to check for trivial copy prop of constants into one of
the SGPR operands and perform that if possible. If this isn't possible put an
explicit copy of Src1 SGPR into M0 and use that instead (this is allowable for
writelane as the constraint is for SGPR read-port and not constant-bus access).

V2: Fix-up cases where writelane has 2 SGPR operands - bug fixes

Update to previous commit to fix an issue where immediates are tested for copy
propagation but aren't suitable as inline constants. The old method caused
issues that resulted in seg faults in later cse phases.

Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint

Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.

This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.

The algorithm used is to check for trivial copy prop of constants into one of
the SGPR operands and perform that if possible. If this isn't possible put an
explicit copy of Src1 SGPR into M0 and use that instead (this is allowable for
writelane as the constraint is for SGPR read-port and not constant-bus access).

Change-Id: Ic3045df22db738ca6572af00138520fd28563990
Implements 3 new intrinsics for implementing waterfall code for regions of code.

Waterfall is implemented as a loop that iterates over an index of values for
active lanes. For each iteration an index is picked (the first active lane) and
all lanes with the same index are left active (the rest are disabled).
The body of the code is then executed and the result accumulated into a result
vector register.
The active lane is then disabled and the next index chosen for the next
iteration.

The worst case for waterfall is one iteration per lane - but it is usually far
less than this.

The implementation uses 3 intrinsics to mark a region:
  - llvm.amdgcn.waterfall.begin
  - llvm.amdgcn.waterfall.readfirstlane
  - llvm.amdgcn.waterfall.end

The group must contain one begin and at least one readfirstlane and end
intrinsic - if the readfirstlane uses the begin index, then there will only be a
single readfirstlane and not 2.

The readfirstlane and end intrinsics are not limited to single dword values.

The waterfall loop will enclose all instructions from the begin to the final end
intrinsic.

See the test case for specific examples of use.

V2:
[AMDGPU] Updates to waterfall intrinsic support

Some fixes for waterfall support.
1. Intrinsics are better defined so enforce type correctness
2. Creation of the waterfall loop requires removing kill flag for operands moved
into the loop, this can otherwise cause an assert during verification
3. Fixed issue causing a tablegen failure due to float return type being used
for a scalar return value (incorrectly). The matching for intrinsic to pseudo
instruction now uses 2 template parameter types to disambiguate when using float
src types.
4. Ensure that any analyses are invalidated due to insertion of new basic
blocks. This should have been the default anyway.
5. Updated test in light of these changes. These are good exemplars for
implementors to check against.

V3:
[AMDGCN] Fixed waterfall intrinsic groups for multiple groups per BB

For cases with more than one waterfall group per basic block, the implementation
would go wrong after creating the first waterfall loop (incorrectly tracking
current BB).

Also added a test case.

V4:
[AMDGCN] Waterfall enhancements

Extra waterfall test for multiple readfirstlane intrinsics. Make sure that the
waterfall intrinsic clauses support multiple readfirstlane intrinsics.

Extended support for begin to allow multi-dword indices for the waterfall
loop. This allows the use of multiple indexes (and multiple non-uniform values)
in the same waterfall loop by combining the individual indices into a single
multi-dword index.
This has the same worst-case of 64 iterations (wave size) as before.

Added a new test case to demonstrate using multi-dword indices.

Added support for i16/f16

Changed the implementation to prevent CSE from merging clauses (by tagging the
intrinsics as having side effects and enhancing uniform detection to work in
this case as well).
Left in some support added to partially support EarlyCSE merging as it is still
valid, but probably won't be triggered much now merging is disabled.

Added support for a last.use intrinsic. See the test for an example of
this. This works in the same way as the end intrinsic, but you tag a last_use
instead (so it includes the next use in the waterfall loop). More than one use
is acceptable as long as it is in the same BB. All instructions up to the last
last.use will be included in the waterfall loop.

An important rule for waterfall clauses (that isn't detected if violated) is
that you can't have something inside the loop that is dependent on an end
intrinsic in the loop - this restriction is logically consistent - you have to use
two loops instead.

V5:
[AMDGPU] Handle unexpected uniform inputs in waterfall intrinsics

Waterfall intrinsic input operands can sometimes be uniform. The waterfall
handling pass should deal with this gracefully, even though this could be
sub-optimal.

This is an initial fix for a problem observed in a game.
Subsequent better fix being worked on which should produce optimal code (by
removing redundant waterfall loops entirely if all the elements turn out to be
uniform anyway).

V6:
[AMDGPU] Improved handling of uniform waterfalls

This builds on a previous fix for uniform values being specified in waterfall
loop intrinsic groups.
Previous fix would just remove any waterfall.readfirstlane intrinsics if the
input was uniform.

This fix improves on this by spotting when ALL waterfall.readfirstlanes are
uniform and removing the waterfall loop entirely.

The tests have been extended to test this mode.

The update also handles the case where not all waterfall.readfirstlanes are
removed. This is also covered in the associated tests.

V7:
[AMDGCN] Re-order waterfall and wholequadmode passes

Running WholeQuadMode pass after waterfall can have bad consequences.
Changing the order improves the outcome - but there may still be problems in
some unusual cases.
For the known uses of the waterfall intrinsics this change will work for now.

V8:
waterfall test fixup

Change-Id: I95b4391b8c0570bd399d70ed64af355d13ef4c84
Summary:
The test shows a case where a full use is not in a subrange because the
subreg is undefined at the use point, but, after a coalesce, the
subrange incorrectly still does not include the use even though the
subreg is now defined at that point. The incorrect live range causes a
"subrange join unreachable" assert on a later coalesce.

This commit ensures that a subrange is extended to all uses that can be
reached from a definition.

V2: Completely different fix, instead of working round the problem in
the later coalesce that actually asserted.
V5: Fixed to not extend liveness to an undef use. Spun second test out
into its own D51257 as it is a completely different problem with the
same "subrange join unreachable" symptom.
V6: Ignore debug uses.
V7: Set up undefs correctly, to avoid generating invalid subranges.

Differential Revision: https://reviews.llvm.org/D49097

Change-Id: I172cfe16d360690e921ebe606d3e90dd1cdd1b71
eliminateUndefCopy was incorrectly eliminating a copy that was only a
partial write of the target. In the test case, the resulting incorrect
live range caused a subrange join unreachable later.

The test case does not fail for me until I have D49097.

Differential Revision: https://reviews.llvm.org/D51257

Change-Id: Ibb767d8a016934431660764b0194634e2113fea2
Summary:
findReachingDefs was incorrectly using its fast path of just blitting in
the live range extension when it did not see an undef itself but
encountered an undef live out of a predecessor block.

Now RegisterCoalescing calls extendToIndices (D49097), the bug was
causing an incorrect subrange which led to a "Couldn't join subrange!"
assert in a later coalesce.

Subscribers: MatzeB, jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D51574

Change-Id: I01c942b41ab5a145348289c8b221c45a2ec7200f
…vePartialRedundancy

Summary:
removePartialRedundancy was using extendToIndices on each subrange
without passing in an Undefs vector containing main reg defs that are
undef in the subrange, causing the above assert.

Unfortunately I can only reproduce this in an ll test. Turning it into a
mir test makes the problem go away.

Subscribers: MatzeB, qcolombet, jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D51849

Change-Id: I4f3ba2afb20d79f9467afb3d882f0328d38531be
…wCopyChain

valuesIdentical is called to determine whether a def can be considered
an "identical erase" because following copy chains proves that the value
being copied is the same value as the value of the other register.

The tests show up problems where the main range decides that a def is
an "identical erase" but a subrange disagrees, because following the
copy chain leads to a def that does not define all subregs in the
subrange (in the case of one test) or a different def to that found in
the main range (in the case of the other test).

The fix here is to only detect an "identical erase" in the main range.
If it is found, it is automatically inherited by the corresponding def
in any subrange, without further analysis of that subrange value.

This fix supersedes D49535/rL338070 and D51848.

Change-Id: I059b5b2273ed6f186d134b98c118fef0494e02f8
This is a temporary fix to avoid a breakage where the export(s) for a
pixel shader are in control flow (due to a kill). A pixel shader must
execute an export done, even with exec=0.

A better fix would be to have a pixel-shader-only pass that inserts a
null export done in the final basic block if there is not one already
there.

Change-Id: I22f201f95d52ff699aa8cac26bb019717b20432e
LLPC has .ll file library functions that break this verifier check. We
need to fix those before re-enabling the check.

Change-Id: I4c000793a805a50f1d91d1ae55c567a356e72519
This commit fixes a really fun bug with inst combine where you have a
vector-of-pointers and only one element of this vector is used.
InstCombine correctly notes that an element is not used, and then will
go back through a gep into that vector-of-pointers and set all vector
elements to undef. This is fine in all cases except that the langref
(and a ton of places in the optimizer) requires that indexing into a
struct has the same index for each element of the vector-of-pointers.

To fix the bug I've just made the gep/undef propagation check that the
current type being indexed into is not a struct when doing the undef
propagation.

Differential Revision: https://reviews.llvm.org/D60600

Change-Id: I6eba34b1cde9c14751c39f04627e39425639ab3b
…d non-divergent

Summary:
This fixes an issue where values are uniform inside of a loop but
the uses of that value outside the loop can be divergent.

This is a temporary fix until the library linker issues can be resolved
in llvm.

Change-Id: I94d3d2e30cc2a6ae8d59e92cadf6f1b6cb7e708b
Implement a pass, enabled by -amdgpu-scratch-bounds-checking,
or the subtarget feature "enable-scratch-bounds-checks",
which adds bounds checks to scratch accesses.
When this pass is enabled, out-of-bounds writes have no effect
and out-of-bounds reads return zero.
This is useful for GFX9 where hardware no longer performs
bounds checking on scratch accesses and hence page faults
result for out-of-bounds accesses by generated by shaders.

Change-Id: Id2ee4b1f32e70b6bde2541db755727b6a407721b
stripValuesNotDefiningMask was asserting that it did not leave an empty
subrange. However that was bogus because the subrange could have been
empty to start with, in the case that LiveRangeCalc::calculate saw a
subreg use that caused a subrange to be created empty.

Differential Revision: https://reviews.llvm.org/D63510

Change-Id: Ibf862415ea422198975cc7a2ca2d98531beec08d
…REL_OFFSET is 0"

This reverts commit 58b3837.

That commit was causing an unnecessary reloc when accessing a global in
the same section. On PAL, that breaks our use of a read-only global
variable as an optimization of a local variable with constant
initialization, because the PAL loader ignores the reloc.

Change-Id: I2792404227d08d98baee69504509828de7e9b36c
Implement appropriate register allocation and exec mask usage
for wave32 scenarios.

Change-Id: I844018f6c07fdda46366af51654017fb4b654d8a
Extended test to check wave32 and gfx10

Change-Id: I620c6d9e737ddccdfd3494abf72632d7ecc4a53f
Change-Id: I3557477344135f7dbece33a2945bcb9f4063c10f
Extend LoadStoreOptimizer to handle IMAGE_LOAD and IMAGE_SAMPLE.

Change-Id: Id49c992b9781254e39e1352125f5ffb212fe4f24
Summary:
Backend defaults to setting WGP mode to 0 or 1 depending on the cumode
feature. For PAL clients we want PAL to set this mode.
Adding check to prevent the backend overriding whatever the front end has set.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63639

Change-Id: Ia3bac3e562fb47089f05c11cae47d1c92f7195ee
Fix a breakage in areMemAccessesTriviallyDisjoint, caused
by trying to merge a non-load MIMG instruction IMAGE_GET_RESINFO.

Only MIMG that loads from memory should be considered.

Change-Id: I012435534b3e6161ee7feb686d4711097d2ab980
Currently the atomic optimizer does not support wave32,
and may generate DPP which is incompatible with GFX10.
This change allows the atomic optimizer to be enabled
unconditionally by the LLPC for all ASICs.

Change-Id: Ia26b527675b6e07319f4299b3f82ad405916bad6
Summary:
This fixes B42473 and B42706.

This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation.

Reviewers: foad, nhaehnle

Subscribers: jvesely, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65274

Change-Id: Ic3fcc51e5850fde3a4649ec3dd84a040a07aaca9
Summary:
Add support for gfx10, where all DPP operations are confined to work
within a single row of 16 lanes, and wave32.

Reviewers: arsenm, sheredom, critson, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, dstuttard, tpr, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65644

Change-Id: I1f8991e9cf732e79a139e79e70a2bf62650693d4
…DD_REL_OFFSET is 0"

This reverts commit ad3ea6fe0eedc9fe48b96453cfed51f8c1dd79de.
…ord in SI_PC_ADD_REL_OFFSET is 0"

(From Jay)

D61491 caused us to use relocs when they're not strictly necessary, to
refer to symbols in the text section. This is a pessimization and it's a
problem for some loaders that don't support relocs yet.

Differential Revision: https://reviews.llvm.org/D65813

Change-Id: Ia84de963ad099b7d2fcfb9cd2b908fa2e5a4788b
Change-Id: I3dddc18d5dea0fd1050633e35d0ac463a9ed26ca
Guzhu-AMD and others added 2 commits January 13, 2022 22:53
…9632404

Based on upstream llvm : 4edb998 [SelectionDAG] treat X constrained labels as i for asm

Added AMD modification notices and removed some GPL files.

Change-Id: I46d09041fccf532e2c31320842d3f932e0d088a2
This is so that we can set/access these flags in other `.cpp` files.
// Enable conditional discard transformations
static cl::opt<bool> EnableConditionalDiscardTransformations(
opt<bool> EnableConditionalDiscardTransformations(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular objection, but I haven't seen this pattern before of putting the option itself (EnableConditionalDiscardTransformations) in the cl namespace. Is it used elsewhere in LLVM?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem to be used occasionally elsewhere in LLVM.
Exposing this option seems fine to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@kuhar kuhar Jan 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could put it in namespace llvm instead of llvm::cl. Both work for me.

trenouf pushed a commit that referenced this pull request Feb 3, 2022
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f2bb3d152e4 in ?? ()
#3  0x00007ffcc5f0cc80 in ?? ()
llvm#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
llvm#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
llvm#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
llvm#7  0x00007f2bb3d144ee in ?? ()
llvm#8  0x746e692f706d742f in ?? ()
llvm#9  0x692d747065637265 in ?? ()
llvm#10 0x2f653631326b3034 in ?? ()
llvm#11 0x646d632e35353532 in ?? ()
llvm#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ

Differential Revision: https://reviews.llvm.org/D118439
Guzhu-AMD pushed a commit that referenced this pull request Mar 4, 2022
This patch fixes a data race in IOHandlerProcessSTDIO. The race is
happens between the main thread and the event handling thread. The main
thread is running the IOHandler (IOHandlerProcessSTDIO::Run()) when an
event comes in that makes us pop the process IO handler which involves
cancelling the IOHandler (IOHandlerProcessSTDIO::Cancel). The latter
calls SetIsDone(true) which modifies m_is_done. At the same time, we
have the main thread reading the variable through GetIsDone().

This patch avoids the race by using a mutex to synchronize the two
threads. On the event thread, in IOHandlerProcessSTDIO ::Cancel method,
we obtain the lock before changing the value of m_is_done. On the main
thread, in IOHandlerProcessSTDIO::Run(), we obtain the lock before
reading the value of m_is_done. Additionally, we delay calling SetIsDone
until after the loop exists, to avoid a potential race between the two
writes.

  Write of size 1 at 0x00010b66bb68 by thread T7 (mutexes: write M2862, write M718324145051843688):
    #0 lldb_private::IOHandler::SetIsDone(bool) IOHandler.h:90 (liblldb.15.0.0git.dylib:arm64+0x971d84)
    #1 IOHandlerProcessSTDIO::Cancel() Process.cpp:4382 (liblldb.15.0.0git.dylib:arm64+0x5ddfec)
    #2 lldb_private::Debugger::PopIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1156 (liblldb.15.0.0git.dylib:arm64+0x3cb2a8)
    #3 lldb_private::Debugger::RemoveIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1063 (liblldb.15.0.0git.dylib:arm64+0x3cbd2c)
    llvm#4 lldb_private::Process::PopProcessIOHandler() Process.cpp:4487 (liblldb.15.0.0git.dylib:arm64+0x5c583c)
    llvm#5 lldb_private::Debugger::HandleProcessEvent(std::__1::shared_ptr<lldb_private::Event> const&) Debugger.cpp:1549 (liblldb.15.0.0git.dylib:arm64+0x3ceabc)
    llvm#6 lldb_private::Debugger::DefaultEventHandler() Debugger.cpp:1622 (liblldb.15.0.0git.dylib:arm64+0x3cf2c0)
    llvm#7 std::__1::__function::__func<lldb_private::Debugger::StartEventHandlerThread()::$_2, std::__1::allocator<lldb_private::Debugger::StartEventHandlerThread()::$_2>, void* ()>::operator()() function.h:352 (liblldb.15.0.0git.dylib:arm64+0x3d1bd8)
    llvm#8 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) HostNativeThreadBase.cpp:62 (liblldb.15.0.0git.dylib:arm64+0x4c71ac)
    llvm#9 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(void*) HostThreadMacOSX.mm:18 (liblldb.15.0.0git.dylib:arm64+0x29ef544)

  Previous read of size 1 at 0x00010b66bb68 by main thread:
    #0 lldb_private::IOHandler::GetIsDone() IOHandler.h:92 (liblldb.15.0.0git.dylib:arm64+0x971db8)
    #1 IOHandlerProcessSTDIO::Run() Process.cpp:4339 (liblldb.15.0.0git.dylib:arm64+0x5ddc7c)
    #2 lldb_private::Debugger::RunIOHandlers() Debugger.cpp:982 (liblldb.15.0.0git.dylib:arm64+0x3cb48c)
    #3 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) CommandInterpreter.cpp:3298 (liblldb.15.0.0git.dylib:arm64+0x506478)
    llvm#4 lldb::SBDebugger::RunCommandInterpreter(bool, bool) SBDebugger.cpp:1166 (liblldb.15.0.0git.dylib:arm64+0x53604)
    llvm#5 Driver::MainLoop() Driver.cpp:634 (lldb:arm64+0x100006294)
    llvm#6 main Driver.cpp:853 (lldb:arm64+0x100007344)

Differential revision: https://reviews.llvm.org/D120762
trenouf pushed a commit that referenced this pull request May 3, 2022
…ified offset and its parents or children with spcified depth."

This reverts commit a3b7cb0.

symbol-offset.test fails under MSAN:

[  1] ; RUN: llvm-pdbutil yaml2pdb %p/Inputs/symbol-offset.yaml --pdb=%t.pdb [FAIL]
llvm-pdbutil yaml2pdb <REDACTED>/llvm/test/tools/llvm-pdbutil/Inputs/symbol-offset.yaml --pdb=<REDACTED>/tmp/symbol-offset.test/symbol-offset.test.tmp.pdb
==9283==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55f975e5eb91 in __libcpp_tls_set <REDACTED>/include/c++/v1/__threading_support:428:12
    #1 0x55f975e5eb91 in set_pointer <REDACTED>/include/c++/v1/thread:196:5
    #2 0x55f975e5eb91 in void* std::__msan::__thread_proxy<std::__msan::tuple<std::__msan::unique_ptr<std::__msan::__thread_struct, std::__msan::default_delete<std::__msan::__thread_struct> >, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) <REDACTED>/include/c++/v1/thread:285:27
    #3 0x7f74a1e55b54 in start_thread (<REDACTED>/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13)
    llvm#4 0x7f74a1dc9f7e in clone (<REDACTED>/libc.so.6+0x13cf7e) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)
trenouf pushed a commit that referenced this pull request May 3, 2022
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).

Unfortunately the implementation has two problems:

1) It didn't actually check for the result number and reject
   it.  parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
   operand parsing.  This also was functionally identical, but
   also had some subtle differences (e.g. the parseOptional
   stuff had a different result type).

I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand.  This keeps the
codepaths unified and adds the missing error checks.

Differential Revision: https://reviews.llvm.org/D124470
Guzhu-AMD pushed a commit that referenced this pull request May 12, 2022
The original fix (commit 23ec578) of
llvm#52787 only adds `Function`s
that have `Instruction`s that directly use `BlockAddress`es into the
bitcode (`FUNC_CODE_BLOCKADDR_USERS`).

However, in either @rickyz's original reproducing code:

```
void f(long);

__attribute__((noinline)) static void fun(long x) {
  f(x + 1);
}

void repro(void) {
  fun(({
    label:
      (long)&&label;
  }));
}
```

```
...
define dso_local void @repro() #0 {
entry:
  br label %label

label:                                            ; preds = %entry
  tail call fastcc void @fun()
  ret void
}

define internal fastcc void @fun() unnamed_addr #1 {
entry:
  tail call void @f(i64 add (i64 ptrtoint (i8* blockaddress(@repro, %label) to i64), i64 1)) #3
  ret void
}
...
```

or the xfs and overlayfs in the Linux kernel, `BlockAddress`es (e.g.,
`i8* blockaddress(@repro, %label)`) may first compose `ConstantExpr`s
(e.g., `i64 ptrtoint (i8* blockaddress(@repro, %label) to i64)`) and
then used by `Instruction`s. This case is not handled by the original
fix.

This patch adds *indirect* users of `BlockAddress`es, i.e., the
`Instruction`s using some `Constant`s which further use the
`BlockAddress`es, into the bitcode as well, by doing depth-first
searches.

Fixes: llvm#52787
Fixes: 23ec578 ("[Bitcode] materialize Functions early when BlockAddress taken")

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D124878
Guzhu-AMD pushed a commit that referenced this pull request Jun 2, 2022
This reverts commit c274b6e.

The x86_64 debian bot got a failure with this patch,
https://lab.llvm.org/buildbot#builders/68/builds/33078
where
SymbolFile/DWARF/x86/DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s
is crashing here -

 #2 0x0000000000425a9f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f57160e9140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140)
 llvm#4 0x00007f570d911e43 lldb_private::SourceManager::GetFile(lldb_private::FileSpec const&) crtstuff.c:0:0
 llvm#5 0x00007f570d914270 lldb_private::SourceManager::DisplaySourceLinesWithLineNumbers(lldb_private::FileSpec const&, unsigned int, unsigned int, unsigned int, unsigned int, char const*, lldb_private::Stream*, lldb_private::SymbolContextList const*) crtstuff.c:0:0
 llvm#6 0x00007f570da662c8 lldb_private::StackFrame::GetStatus(lldb_private::Stream&, bool, bool, bool, char const*) crtstuff.c:0:0

I don't get a failure here my mac, I'll review this method more
closely tomorrow.
Guzhu-AMD pushed a commit that referenced this pull request Jul 14, 2022
…ned form

The DWARF spec says:

 Any debugging information entry representing the declaration of an object,
 module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and
 DW_AT_decl_column attributes, each of whose value is an unsigned integer
							 ^^^^^^^^
 constant.

If however, a producer happens to emit DW_AT_decl_file /
DW_AT_decl_line using a signed integer form, llvm-dwarfdump crashes,
like so:

     (... snip ...)
     0x000000b4:   DW_TAG_structure_type
                     DW_AT_name      ("test_struct")
                     DW_AT_byte_size (136)
                     DW_AT_decl_file (llvm-dwarfdump: (... snip ...)/llvm/include/llvm/ADT/Optional.h:197: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() &
 [with T = long unsigned int]: Assertion `hasVal' failed.
     PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
     Stack dump:
     0.      Program arguments: /opt/rocm/llvm/bin/llvm-dwarfdump ./testsuite/outputs/gdb.rocm/lane-pc-vega20/lane-pc-vega20-kernel.so
      #0 0x000055cc8e78315f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
      #1 0x000055cc8e780d3d SignalHandler(int) Signals.cpp:0:0
      #2 0x00007f8f2cae8420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
      #3 0x00007f8f2c58d00b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
      llvm#4 0x00007f8f2c56c859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7
      llvm#5 0x00007f8f2c56c729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8
      llvm#6 0x00007f8f2c56c729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34
      llvm#7 0x00007f8f2c57dfd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
      llvm#8 0x000055cc8e58ceb9 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2e0eb9)
      llvm#9 0x000055cc8e58bec3 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2dfec3)
     llvm#10 0x000055cc8e5b28a3 llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, llvm::DIDumpOptions) (.part.21) DWARFCompileUnit.cpp:0:0

Likewise with DW_AT_call_file / DW_AT_call_line.

The problem is that the code in llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
dumping these attributes assumes that
FormValue.getAsUnsignedConstant() returns an armed optional.  If in
debug mode, we get an assertion line the above.  If in release mode,
and asserts are compiled out, then we proceed as if the optional had a
value, running into undefined behavior, printing whatever random
value.

Fix this by checking whether the optional returned by
FormValue.getAsUnsignedConstant() has a value, like done in other
places.

In addition, DWARFVerifier.cpp is validating DW_AT_call_file /
DW_AT_decl_file, but not AT_call_line / DW_AT_decl_line.  This commit
fixes that too.

The llvm-dwarfdump/X86/verify_file_encoding.yaml testcase is extended
to cover these cases.  Current llvm-dwarfdump crashes running the
newly-extended test.

"make check-llvm-tools-llvm-dwarfdump" shows no regressions, on x86-64
GNU/Linux.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D129392
Flakebi pushed a commit to Flakebi/amdvlk-llvm-project that referenced this pull request Dec 19, 2022
Another attempt to skip the fast-math linker test on powerpc. The test
has to be skipped because there is no crtfastmath.o on powerpc.

Change recommended by Amy Kwan <amyk>.

See https://reviews.llvm.org/D138675
trenouf pushed a commit that referenced this pull request Mar 8, 2023
For example, if you have a chain of inlined funtions like this:

   1 #include <stdlib.h>
   2 int g1 = 4, g2 = 6;
   3
   4 static inline void bar(int q) {
   5   if (q > 5)
   6     abort();
   7 }
   8
   9 static inline void foo(int q) {
  10   bar(q);
  11 }
  12
  13 int main() {
  14   foo(g1);
  15   foo(g2);
  16   return 0;
  17 }

with optimizations you could end up with a single abort call for the two
inlined instances of foo(). When merging the locations for those inlined
instances you would previously end up with a 0:0 location in main().
Leaving out that inlined chain from the location for the abort call
could make troubleshooting difficult in some cases.

This patch changes DILocation::getMergedLocation() to try to handle such
cases. The function is rewritten to first find a common starting point
for the two locations (same subprogram and inlined-at location), and
then in reverse traverses the inlined-at chain looking for matches in
each subprogram. For each subprogram, the merge function will find the
nearest common scope for the two locations, and matching line and
column (or set them to 0 if not matching).

In the example above, you will for the abort call get a location in
bar() at 6:5, inlined in foo() at 10:3, inlined in main() at 0:0 (since
the two inlined functions are on different lines, but in the same
scope).

I have not seen anything in the DWARF standard that would disallow
inlining a non-zero location at 0:0 in the inlined-at function, and both
LLDB and GDB seem to accept these locations (with D142552 needed for
LLDB to handle cases where the file, line and column number are all 0).
One incompatibility with GDB is that it seems to ignore 0-line locations
in some cases, but I am not aware of any specific issue that this patch
produces related to that.

With x86-64 LLDB (trunk) you previously got:

  frame #0: 0x00007ffff7a44930 libc.so.6`abort
  frame #1: 0x00005555555546ec a.out`main at merge.c:0

and will now get:

  frame #0: 0x[...] libc.so.6`abort
  frame #1: 0x[...] a.out`main [inlined] bar(q=<unavailable>) at merge.c:6:5
  frame #2: 0x[...] a.out`main [inlined] foo(q=<unavailable>) at merge.c:10:3
  frame #3: 0x[...] a.out`main at merge.c:0

and with x86-64 GDB (11.1) you will get:

  (gdb) bt
  #0  0x00007ffff7a44930 in abort () from /lib64/libc.so.6
  #1  0x00005555555546ec in bar (q=<optimized out>) at merge.c:6
  #2  foo (q=<optimized out>) at merge.c:10
  #3  0x00005555555546ec in main ()

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D142556
Guzhu-AMD pushed a commit that referenced this pull request Mar 29, 2023
This change prevents rare deadlocks observed for specific macOS/iOS GUI
applications which issue many `dlopen()` calls from multiple different
threads at startup and where TSan finds and reports a race during
startup.  Providing a reliable test for this has been deemed infeasible.

Although I've only observed this deadlock on Apple platforms,
conceptually the cause is not confined to Apple code so the fix lives in
platform-independent code.

Deadlock scenario:
```
Thread 2                    | Thread 4
ReportRace()                |
Lock internal TSan mutexes  |
  &ctx->slot_mtx            |
                            | dlopen() interceptor
                            | OnLibraryLoaded()
                            | MemoryMappingLayout::DumpListOfModules()
                            | calls dyld API, which takes internal lock
                            | lock() interceptor
                            | TSan tries to take internal mutexes again
                            |   &ctx->slot_mtx
call into symbolizer        |
MemoryMappingLayout::DumpListOfModules()
calls dyld API, which hangs on trying to take lock
```
Resulting in:
* Thread 2 has internal TSan mutex, blocked on dyld lock
* Thread 4 has dyld lock, blocked on internal TSan mutex

The fix prevents this situation by not intercepting any of the calls
originating from `MemoryMappingLayout::DumpListOfModules()`.

Stack traces for deadlock between ReportRace() and dlopen() interceptor:
```
thread #2, queue = 'com.apple.root.default-qos'
  frame #0: libsystem_kernel.dylib
  frame #1: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=<unavailable>, options=<unavailable>) at tsan_interceptors_mac.cpp:306:3
  frame #2: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814523c0) block_pointer) at DyldRuntimeState.cpp:227:28
  frame #3: dyld`dyld4::APIs::_dyld_get_image_header(this=0x0000000101012a20, imageIndex=614) at DyldAPIs.cpp:240:11
  frame llvm#4: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::CurrentImageHeader(this=<unavailable>) at sanitizer_procmaps_mac.cpp:391:35
  frame llvm#5: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f2a2800, segment=0x000000016f2a2738) at sanitizer_procmaps_mac.cpp:397:51
  frame llvm#6: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f2a2800, modules=0x00000001011000a0) at sanitizer_procmaps_mac.cpp:460:10
  frame llvm#7: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x00000001011000a0) at sanitizer_mac.cpp:610:18
  frame llvm#8: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(unsigned long) [inlined] __sanitizer::Symbolizer::RefreshModules(this=0x0000000101100078) at sanitizer_symbolizer_libcdep.cpp:185:12
  frame llvm#9: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(this=0x0000000101100078, address=6465454512) at sanitizer_symbolizer_libcdep.cpp:204:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::SymbolizePC(this=0x0000000101100078, addr=6465454512) at sanitizer_symbolizer_libcdep.cpp:88:15
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeCode(addr=6465454512) at tsan_symbolize.cpp:106:35
  frame llvm#12: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeStack(trace=StackTrace @ 0x0000600002d66d00) at tsan_rtl_report.cpp:112:28
  frame llvm#13: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedReportBase::AddMemoryAccess(this=0x000000016f2a2a90, addr=4381057136, external_tag=<unavailable>, s=<unavailable>, tid=<unavailable>, stack=<unavailable>, mset=0x00000001012fc310) at tsan_rtl_report.cpp:190:16
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=<unavailable>, old=<unavailable>, typ0=1) at tsan_rtl_report.cpp:795:9
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__tsan::DoReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=Shadow @ x22, old=Shadow @ 0x0000600002d6b4f0, typ=1) at tsan_rtl_access.cpp:166:3
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) at tsan_rtl_access.cpp:220:5
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) [inlined] __tsan::MemoryAccess(thr=0x00000001012fc000, pc=<unavailable>, addr=<unavailable>, size=8, typ=1) at tsan_rtl_access.cpp:442:3
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(addr=<unavailable>) at tsan_interface.inc:34:3
  <call into TSan from from instrumented code>

thread llvm#4, queue = 'com.apple.dock.fullscreen'
  frame #0:  libsystem_kernel.dylib
  frame #1:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::FutexWait(p=<unavailable>, cmp=<unavailable>) at sanitizer_mac.cpp:540:3
  frame #2:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Semaphore::Wait(this=<unavailable>) at sanitizer_mutex.cpp:35:7
  frame #3:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x0000000102992a80) at sanitizer_mutex.h:196:18
  frame llvm#4:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:383:10
  frame llvm#5:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:382:77
  frame llvm#6:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() at tsan_rtl.h:708:10
  frame llvm#7:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::TryTraceFunc(thr=0x000000010f084000, pc=0) at tsan_rtl.h:751:7
  frame llvm#8:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::FuncExit(thr=0x000000010f084000) at tsan_rtl.h:798:7
  frame llvm#9:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=0x000000016f3ba280) at tsan_interceptors_posix.cpp:300:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=<unavailable>) at tsan_interceptors_posix.cpp:293:41
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=0x000000016f21b1e8, options=OS_UNFAIR_LOCK_NONE) at tsan_interceptors_mac.cpp:310:1
  frame llvm#12: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814525d4) block_pointer) at DyldRuntimeState.cpp:227:28
  frame llvm#13: dyld`dyld4::APIs::_dyld_get_image_vmaddr_slide(this=0x0000000101012a20, imageIndex=412) at DyldAPIs.cpp:273:11
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(__sanitizer::MemoryMappedSegment*) at sanitizer_procmaps_mac.cpp:286:17
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f3ba560, segment=0x000000016f3ba498) at sanitizer_procmaps_mac.cpp:432:15
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f3ba560, modules=0x000000016f3ba618) at sanitizer_procmaps_mac.cpp:460:10
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x000000016f3ba618) at sanitizer_mac.cpp:610:18
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x0000000101f3aa40, name="<some library>") at sanitizer_libignore.cpp:54:11
  frame llvm#19: libclang_rt.tsan_osx_dynamic.dylib`::wrap_dlopen(filename="<some library>", flag=<unavailable>) at sanitizer_common_interceptors.inc:6466:3
  <library code>
```

rdar://106766395

Differential Revision: https://reviews.llvm.org/D146593
Guzhu-AMD pushed a commit that referenced this pull request Jun 1, 2023
…est unittest

Need to finalize the DIBuilder to avoid leak sanitizer errors
like this:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x55c99ea1761d in operator new(unsigned long)
    #1 0x55c9a518ae49 in operator new
    #2 0x55c9a518ae49 in llvm::MDTuple::getImpl(...)
    #3 0x55c9a4f1b1ec in getTemporary
    llvm#4 0x55c9a4f1b1ec in llvm::DIBuilder::createFunction(...)
Guzhu-AMD pushed a commit that referenced this pull request Jul 10, 2023
Running this on Amazon Ubuntu the final backtrace is:
```
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
  * frame #0: 0x0000aaaaaaaa07d0 a.out`func_c at main.c:10:3
    frame #1: 0x0000aaaaaaaa07c4 a.out`func_b at main.c:14:3
    frame #2: 0x0000aaaaaaaa07b4 a.out`func_a at main.c:18:3
    frame #3: 0x0000aaaaaaaa07a4 a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:22:3
    frame llvm#4: 0x0000fffff7b373fc libc.so.6`___lldb_unnamed_symbol2962 + 108
    frame llvm#5: 0x0000fffff7b374cc libc.so.6`__libc_start_main + 152
    frame llvm#6: 0x0000aaaaaaaa06b0 a.out`_start + 48
```
This causes the test to fail because of the extra ___lldb_unnamed_symbol2962 frame
(an inlined function?).

To fix this, strictly check all the frames in main.c then for the rest
just check we find __libc_start_main and _start in that order regardless
of other frames in between.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D154204
trenouf pushed a commit that referenced this pull request Jul 18, 2023
…tput

The crash happens in clang::driver::tools::SplitDebugName when Output is
InputInfo::Nothing. It doesn't happen with standalone clang driver because
output is created in Driver::BuildJobsForActionNoCache.

Example backtrace:
```
* thread #1, name = 'clangd', stop reason = hit program assert
  * frame #0: 0x00007ffff5c4eacf libc.so.6`raise + 271
    frame #1: 0x00007ffff5c21ea5 libc.so.6`abort + 295
    frame #2: 0x00007ffff5c21d79 libc.so.6`__assert_fail_base.cold.0 + 15
    frame #3: 0x00007ffff5c47426 libc.so.6`__assert_fail + 70
    frame llvm#4: 0x000055555dc0923c clangd`clang::driver::InputInfo::getFilename(this=0x00007fffffff9398) const at InputInfo.h:84:5
    frame llvm#5: 0x000055555dcd0d8d clangd`clang::driver::tools::SplitDebugName(JA=0x000055555f6c6a50, Args=0x000055555f6d0b80, Input=0x00007fffffff9678, Output=0x00007fffffff9398) at CommonArgs.cpp:1275:40
    frame llvm#6: 0x000055555dc955a5 clangd`clang::driver::tools::Clang::ConstructJob(this=0x000055555f6c69d0, C=0x000055555f6c64a0, JA=0x000055555f6c6a50, Output=0x00007fffffff9398, Inputs=0x00007fffffff9668, Args=0x000055555f6d0b80, LinkingOutput=0x0000000000000000) const at Clang.cpp:5690:33
    frame llvm#7: 0x000055555dbf6b54 clangd`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5618:10
    frame llvm#8: 0x000055555dbf4ef0 clangd`clang::driver::Driver::BuildJobsForAction(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5306:26
    frame llvm#9: 0x000055555dbeb590 clangd`clang::driver::Driver::BuildJobs(this=0x00007fffffffb5e0, C=0x000055555f6c64a0) const at Driver.cpp:4844:5
    frame llvm#10: 0x000055555dbe6b0f clangd`clang::driver::Driver::BuildCompilation(this=0x00007fffffffb5e0, ArgList=ArrayRef<const char *> @ 0x00007fffffffb268) at Driver.cpp:1496:3
    frame llvm#11: 0x000055555b0cc0d9 clangd`clang::createInvocation(ArgList=ArrayRef<const char *> @ 0x00007fffffffbb38, Opts=CreateInvocationOptions @ 0x00007fffffffbb90) at CreateInvocationFromCommandLine.cpp:53:52
    frame llvm#12: 0x000055555b378e7b clangd`clang::clangd::buildCompilerInvocation(Inputs=0x00007fffffffca58, D=0x00007fffffffc158, CC1Args=size=0) at Compiler.cpp:116:44
    frame llvm#13: 0x000055555895a6c8 clangd`clang::clangd::(anonymous namespace)::Checker::buildInvocation(this=0x00007fffffffc760, TFS=0x00007fffffffe570, Contents= Has Value=false ) at Check.cpp:212:9
    frame llvm#14: 0x0000555558959cec clangd`clang::clangd::check(File=(Data = "build/test.cpp", Length = 64), TFS=0x00007fffffffe570, Opts=0x00007fffffffe600) at Check.cpp:486:34
    frame llvm#15: 0x000055555892164a clangd`main(argc=4, argv=0x00007fffffffecd8) at ClangdMain.cpp:993:12
    frame llvm#16: 0x00007ffff5c3ad85 libc.so.6`__libc_start_main + 229
    frame llvm#17: 0x00005555585bbe9e clangd`_start + 46
```

Test Plan: ninja ClangDriverTests && tools/clang/unittests/Driver/ClangDriverTests

Differential Revision: https://reviews.llvm.org/D154602
trenouf pushed a commit that referenced this pull request Aug 11, 2023
TSan reports the following data race:

  Write of size 4 at 0x000109e0b160 by thread T2 (mutexes: write M0, write M1):
    #0 NativeFile::Close() File.cpp:329
    #1 ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:232
    #2 Communication::Disconnect(lldb_private::Status*) Communication.cpp:61
    #3 process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164
    llvm#4 Process::SetExitStatus(int, char const*) Process.cpp:1097
    llvm#5 process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387

  Previous read of size 4 at 0x000109e0b160 by main thread (mutexes: write M2):
    #0 NativeFile::IsValid() const File.h:393
    #1 ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121
    #2 Communication::IsConnected() const Communication.cpp:79
    #3 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256
    llvm#4 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...l) GDBRemoteCommunication.cpp:244
    llvm#5 process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(llvm::StringRef, StringExtractorGDBRemote&) GDBRemoteClientBase.cpp:246

The problem is that in WaitForPacketNoLock's run loop, it checks that
the connection is still connected. This races with the
ConnectionFileDescriptor disconnecting. Most (but not all) access to the
IOObject in ConnectionFileDescriptorPosix is already gated by the mutex.
This patch just protects IsConnected in the same way.

Differential revision: https://reviews.llvm.org/D157347
Guzhu-AMD pushed a commit that referenced this pull request Aug 24, 2023
Thread sanitizer reports the following data race:

```
WARNING: ThreadSanitizer: data race (pid=43201)
  Write of size 4 at 0x00010520c474 by thread T1 (mutexes: write M0, write M1):
    #0 lldb_private::PipePosix::CloseWriteFileDescriptor() PipePosix.cpp:242 (liblldb.18.0.0git.dylib:arm64+0x414700) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::PipePosix::Close() PipePosix.cpp:217 (liblldb.18.0.0git.dylib:arm64+0x4144e8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #2 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:239 (liblldb.18.0.0git.dylib:arm64+0x40a620) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #3 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    llvm#4 lldb_private::process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1167 (liblldb.18.0.0git.dylib:arm64+0x8ed984) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)

  Previous read of size 4 at 0x00010520c474 by main thread (mutexes: write M2, write M3):
    #0 lldb_private::PipePosix::CanWrite() const PipePosix.cpp:229 (liblldb.18.0.0git.dylib:arm64+0x4145e4) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #1 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:212 (liblldb.18.0.0git.dylib:arm64+0x40a4a8) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #2 lldb_private::Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 (liblldb.18.0.0git.dylib:arm64+0x2a9318) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    #3 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:373 (liblldb.18.0.0git.dylib:arm64+0x8b9c48) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
    llvm#4 lldb_private::process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(StringExtractorGDBRemote&, lldb_private::Timeout<std::__1::ratio<1l, 1000000l>>, bool) GDBRemoteCommunication.cpp:243 (liblldb.18.0.0git.dylib:arm64+0x8b9904) (BuildId: 2983976beb2637b5943bff32fd12eb8932000000200000000100000000000e00)
```

Fix this by adding a mutex to PipePosix.

Differential Revision: https://reviews.llvm.org/D157654
Guzhu-AMD pushed a commit that referenced this pull request Aug 31, 2023
This reverts commit 0e63f1a.

clang-format started to crash with contents like:
a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 llvm#4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 llvm#5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 llvm#6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 llvm#7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 llvm#8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 llvm#9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
Guzhu-AMD pushed a commit that referenced this pull request Oct 12, 2023
…fine.parallel verifier

This patch updates AffineParallelOp::verify() to check each result type matches
its corresponding reduction op (i.e, the result type must be a `FloatType` if
the reduction attribute is `addf`)

affine.parallel will crash on --lower-affine if the corresponding result type
cannot match the reduction attribute.

```
      %128 = affine.parallel (%arg2, %arg3) = (0, 0) to (8, 7) reduce ("maxf") -> (memref<8x7xf32>) {
        %alloc_33 = memref.alloc() : memref<8x7xf32>
        affine.yield %alloc_33 : memref<8x7xf32>
      }
```
This will crash and report a type conversion issue when we run `mlir-opt --lower-affine`

```
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: mlir-opt --lower-affine temp.mlir
 #0 0x0000000102a18f18 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/workspacebin/mlir-opt+0x1002f8f18)
 #1 0x0000000102a171b4 llvm::sys::RunSignalHandlers() (/workspacebin/mlir-opt+0x1002f71b4)
 #2 0x0000000102a195c4 SignalHandler(int) (/workspacebin/mlir-opt+0x1002f95c4)
 #3 0x00000001be7894c4 (/usr/lib/system/libsystem_platform.dylib+0x1803414c4)
 llvm#4 0x00000001be771ee0 (/usr/lib/system/libsystem_pthread.dylib+0x180329ee0)
 llvm#5 0x00000001be6ac340 (/usr/lib/system/libsystem_c.dylib+0x180264340)
 llvm#6 0x00000001be6ab754 (/usr/lib/system/libsystem_c.dylib+0x180263754)
 llvm#7 0x0000000106864790 mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (.cold.4) (/workspacebin/mlir-opt+0x104144790)
 llvm#8 0x0000000102ba66ac mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x1004866ac)
 llvm#9 0x0000000102ba6910 mlir::arith::getIdentityValue(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x100486910)
...
```

Fixes llvm#64068

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157985
Guzhu-AMD pushed a commit that referenced this pull request Nov 2, 2023
…tePluginObject

After llvm#68052 this function changed from returning
a nullptr with `return {};` to returning Expected and hitting `llvm_unreachable` before it could
do so.

I gather that we're never supposed to call this function, but on Windows we actually do call
this function because `interpreter->CreateScriptedProcessInterface()` returns
`ScriptedProcessInterface` not `ScriptedProcessPythonInterface`. Likely because
`target_sp->GetDebugger().GetScriptInterpreter()` also does not return a Python related class.

The previously XFAILed test crashed with:
```
 # .---command stderr------------
 # | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
 # | Stack dump:
 # | 0.  Program arguments: c:\\users\\tcwg\\david.spickett\\build-llvm\\bin\\lldb-test.exe ir-memory-map C:\\Users\\tcwg\\david.spickett\\build-llvm\\tools\\lldb\\test\\Shell\\Expr\\Output\\TestIRMemoryMapWindows.test.tmp C:\\Users\\tcwg\\david.spickett\\llvm-project\\lldb\\test\\Shell\\Expr/Inputs/ir-memory-map-basic
 # | 1.  HandleCommand(command = "run")
 # | Exception Code: 0xC000001D
 # | #0 0x00007ff696b5f588 lldb_private::ScriptedProcessInterface::CreatePluginObject(class llvm::StringRef, class lldb_private::ExecutionContext &, class std::shared_ptr<class lldb_private::StructuredData::Dictionary>, class lldb_private::StructuredData::Generic *) C:\Users\tcwg\david.spickett\llvm-project\lldb\include\lldb\Interpreter\Interfaces\ScriptedProcessInterface.h:28:0
 # | #1 0x00007ff696b1d808 llvm::Expected<std::shared_ptr<lldb_private::StructuredData::Generic> >::operator bool C:\Users\tcwg\david.spickett\llvm-project\llvm\include\llvm\Support\Error.h:567:0
 # | #2 0x00007ff696b1d808 lldb_private::ScriptedProcess::ScriptedProcess(class std::shared_ptr<class lldb_private::Target>, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::ScriptedMetadata const &, class lldb_private::Status &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Process\scripted\ScriptedProcess.cpp:115:0
 # | #3 0x00007ff696b1d124 std::shared_ptr<lldb_private::ScriptedProcess>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1478:0
 # | llvm#4 0x00007ff696b1d124 lldb_private::ScriptedProcess::CreateInstance(class std::shared_ptr<class lldb_private::Target>, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Process\scripted\ScriptedProcess.cpp:61:0
 # | llvm#5 0x00007ff69699c8f4 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | llvm#6 0x00007ff69699c8f4 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | llvm#7 0x00007ff69699c8f4 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | llvm#8 0x00007ff69699c8f4 lldb_private::Process::FindPlugin(class std::shared_ptr<class lldb_private::Target>, class llvm::StringRef, class std::shared_ptr<class lldb_private::Listener>, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Process.cpp:396:0
 # | llvm#9 0x00007ff6969bd708 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | llvm#10 0x00007ff6969bd708 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | llvm#11 0x00007ff6969bd708 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | llvm#12 0x00007ff6969bd708 lldb_private::Target::CreateProcess(class std::shared_ptr<class lldb_private::Listener>, class llvm::StringRef, class lldb_private::FileSpec const *, bool) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Target.cpp:215:0
 # | llvm#13 0x00007ff696b13af0 std::_Ptr_base<lldb_private::Process>::_Ptr_base C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1230:0
 # | llvm#14 0x00007ff696b13af0 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1524:0
 # | llvm#15 0x00007ff696b13af0 lldb_private::PlatformWindows::DebugProcess(class lldb_private::ProcessLaunchInfo &, class lldb_private::Debugger &, class lldb_private::Target &, class lldb_private::Status &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Plugins\Platform\Windows\PlatformWindows.cpp:495:0
 # | llvm#16 0x00007ff6969cf590 std::_Ptr_base<lldb_private::Process>::_Move_construct_from C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1237:0
 # | llvm#17 0x00007ff6969cf590 std::shared_ptr<lldb_private::Process>::shared_ptr C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1534:0
 # | llvm#18 0x00007ff6969cf590 std::shared_ptr<lldb_private::Process>::operator= C:\Program Files\Microsoft Visual Studio\2022\Preview\VC\Tools\MSVC\14.35.32124\include\memory:1594:0
 # | llvm#19 0x00007ff6969cf590 lldb_private::Target::Launch(class lldb_private::ProcessLaunchInfo &, class lldb_private::Stream *) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Target\Target.cpp:3274:0
 # | llvm#20 0x00007ff696fff82c CommandObjectProcessLaunch::DoExecute(class lldb_private::Args &, class lldb_private::CommandReturnObject &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Commands\CommandObjectProcess.cpp:258:0
 # | llvm#21 0x00007ff696fab6c0 lldb_private::CommandObjectParsed::Execute(char const *, class lldb_private::CommandReturnObject &) C:\Users\tcwg\david.spickett\llvm-project\lldb\source\Interpreter\CommandObject.cpp:751:0
 # `-----------------------------
 # error: command failed with exit status: 0xc000001d
```

That might be a bug on the Windows side, or an artifact of how our build is setup,
but whatever it is, having `CreatePluginObject` return an error and
the caller check it, fixes the failing test.

The built lldb can run the script command to use Python, but I'm not sure if that means
anything.
Guzhu-AMD pushed a commit that referenced this pull request Dec 15, 2023
… on (llvm#74207)

lld string tail merging interacts badly with ASAN on Windows, as is
reported in llvm#62078.
A similar error was found when building LLVM with
`-DLLVM_USE_SANITIZER=Address`:
```console
[2/2] Building GenVT.inc...
FAILED: include/llvm/CodeGen/GenVT.inc C:/Dev/llvm-project/Build_asan/include/llvm/CodeGen/GenVT.inc
cmd.exe /C "cd /D C:\Dev\llvm-project\Build_asan && C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe -gen-vt -I C:/Dev/llvm-project/llvm/include/llvm/CodeGen -IC:/Dev/llvm-project/Build_asan/include -IC:/Dev/llvm-project/llvm/include C:/Dev/llvm-project/llvm/include/llvm/CodeGen/ValueTypes.td --write-if-changed -o include/llvm/CodeGen/GenVT.inc -d include/llvm/CodeGen/GenVT.inc.d"       
=================================================================
==31944==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7ff6cff80d20 at pc 0x7ff6cfcc7378 bp 0x00e8bcb8e990 sp 0x00e8bcb8e9d8
READ of size 1 at 0x7ff6cff80d20 thread T0
    #0 0x7ff6cfcc7377 in strlen (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1400a7377)
    #1 0x7ff6cfde50c2 in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401c50c2)
    #2 0x7ff6cfdd75ef in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401b75ef)
    #3 0x7ff6cfde59f9 in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401c59f9)
    llvm#4 0x7ff6cff03f6c in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1402e3f6c)
    llvm#5 0x7ff6cfefbcbc in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1402dbcbc)
    llvm#6 0x7ffb7f247343  (C:\WINDOWS\System32\KERNEL32.DLL+0x180017343)
    llvm#7 0x7ffb800826b0  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800526b0)

0x7ff6cff80d20 is located 31 bytes after global variable '"#error \"ArgKind is not defined\"\n"...' defined in 'C:\Dev\llvm-project\llvm\utils\TableGen\IntrinsicEmitter.cpp' (0x7ff6cff80ce0) of size 33
  '"#error \"ArgKind is not defined\"\n"...' is ascii string '#error "ArgKind is not defined"
'
0x7ff6cff80d20 is located 0 bytes inside of global variable '""' defined in 'C:\Dev\llvm-project\llvm\utils\TableGen\IntrinsicEmitter.cpp' (0x7ff6cff80d20) of size 1
  '""' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1400a7377) in strlen
Shadow bytes around the buggy address:
  0x7ff6cff80a80: 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9
  0x7ff6cff80b00: f9 f9 f9 f9 00 00 00 00 00 00 00 00 01 f9 f9 f9
  0x7ff6cff80b80: f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9
  0x7ff6cff80c00: 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x7ff6cff80c80: 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
=>0x7ff6cff80d00: 01 f9 f9 f9[f9]f9 f9 f9 00 00 00 00 00 00 00 00
  0x7ff6cff80d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==31944==ABORTING
```
This is reproducible with the 17.0.3 release:
```console
$ clang-cl --version
clang version 17.0.3
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\LLVM\bin
$ cmake -S llvm -B Build -G Ninja -DLLVM_USE_SANITIZER=Address -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl -DCMAKE_MSVC_RUNTIME_LIBRARY=MultiThreaded -DCMAKE_BUILD_TYPE=Release
$ cd Build
$ ninja all
```
Guzhu-AMD pushed a commit that referenced this pull request Dec 21, 2023
Internal builds of the unittests with msan flagged mempcpy_test.

    ==6862==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55e34d7d734a in length
llvm-project/libc/src/__support/CPP/string_view.h:41:11
#1 0x55e34d7d734a in string_view
llvm-project/libc/src/__support/CPP/string_view.h:71:24
#2 0x55e34d7d734a in
__llvm_libc_9999_0_0_git::testing::Test::testStrEq(char const*, char
const*, char const*, char const*,
__llvm_libc_9999_0_0_git::testing::internal::Location)
llvm-project/libc/test/UnitTest/LibcTest.cpp:284:13
#3 0x55e34d7d4e09 in LlvmLibcMempcpyTest_Simple::Run()
llvm-project/libc/test/src/string/mempcpy_test.cpp:20:3
llvm#4 0x55e34d7d6dff in
__llvm_libc_9999_0_0_git::testing::Test::runTests(char const*)
llvm-project/libc/test/UnitTest/LibcTest.cpp:133:8
llvm#5 0x55e34d7d86e0 in main
llvm-project/libc/test/UnitTest/LibcTestMain.cpp:21:10

SUMMARY: MemorySanitizer: use-of-uninitialized-value
llvm-project/libc/src/__support/CPP/string_view.h:41:11 in length

What's going on here is that mempcpy_test.cpp's Simple test is using
ASSERT_STREQ with a partially initialized char array. ASSERT_STREQ calls
Test::testStrEq which constructs a cpp:string_view. That constructor
calls the
private method cpp::string_view::length. When built with msan, the loop
is
transformed into multi-byte access, which then fails upon access.

I took a look at libc++'s __constexpr_strlen which just calls
__builtin_strlen(). Replacing the implementation of
cpp::string_view::length
with a call to __builtin_strlen() may still result in out of bounds
access when
the test is built with msan.

It's not safe to use ASSERT_STREQ with a partially initialized array.
Initialize the whole array so that the test passes.
Guzhu-AMD pushed a commit that referenced this pull request Dec 21, 2023
We'd like a way to select the current thread by its thread ID (rather
than its internal LLDB thread index).

This PR adds a `-t` option (`--thread_id` long option) that tells the
`thread select` command to interpret the `<thread-index>` argument as a
thread ID.

Here's an example of it working:
```
michristensen@devbig356 llvm/llvm-project (thread-select-tid) » ../Debug/bin/lldb ~/scratch/cpp/threading/a.out
(lldb) target create "/home/michristensen/scratch/cpp/threading/a.out"
Current executable set to '/home/michristensen/scratch/cpp/threading/a.out' (x86_64).
(lldb) b 18
Breakpoint 1: where = a.out`main + 80 at main.cpp:18:12, address = 0x0000000000000850
(lldb) run
Process 215715 launched: '/home/michristensen/scratch/cpp/threading/a.out' (x86_64)
This is a thread, i=1
This is a thread, i=2
This is a thread, i=3
This is a thread, i=4
This is a thread, i=5
Process 215715 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12
   15     for (int i = 0; i < 5; i++) {
   16       pthread_create(&thread_ids[i], NULL, foo, NULL);
   17     }
-> 18     for (int i = 0; i < 5; i++) {
   19       pthread_join(thread_ids[i], NULL);
   20     }
   21     return 0;
(lldb) thread select 2
* thread #2, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread info
thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'

(lldb) thread list
Process 215715 stopped
  thread #1: tid = 215715, 0x0000555555400850 a.out`main at main.cpp:18:12, name = 'a.out', stop reason = breakpoint 1.1
* thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread #3: tid = 216048, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread llvm#4: tid = 216049, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread llvm#5: tid = 216050, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread llvm#6: tid = 216051, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
(lldb) thread select 215715
error: invalid thread #215715.
(lldb) thread select -t 215715
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12
   15     for (int i = 0; i < 5; i++) {
   16       pthread_create(&thread_ids[i], NULL, foo, NULL);
   17     }
-> 18     for (int i = 0; i < 5; i++) {
   19       pthread_join(thread_ids[i], NULL);
   20     }
   21     return 0;
(lldb) thread select -t 216051
* thread llvm#6, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select 3
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select -t 216048
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select --thread_id 216048
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) help thread select
Change the currently selected thread.

Syntax: thread select <cmd-options> <thread-index>

Command Options Usage:
  thread select [-t] <thread-index>

       -t ( --thread_id )
            Provide a thread ID instead of a thread index.

     This command takes options and free-form arguments.  If your arguments
     resemble option specifiers (i.e., they start with a - or --), you must use
     ' -- ' between the end of the command options and the beginning of the
     arguments.
(lldb) c
Process 215715 resuming
Process 215715 exited with status = 0 (0x00000000)
```
qiaojbao pushed a commit that referenced this pull request Jan 26, 2024
This PR adds support for thread names in lldb on Windows.

```
(lldb) thr list
Process 2960 stopped
  thread llvm#53: tid = 0x03a0, 0x00007ff84582db34 ntdll.dll`NtWaitForMultipleObjects + 20
  thread llvm#29: tid = 0x04ec, 0x00007ff845830a14 ntdll.dll`NtWaitForAlertByThreadId + 20, name = 'SPUW.6'
  thread llvm#89: tid = 0x057c, 0x00007ff845830a14 ntdll.dll`NtWaitForAlertByThreadId + 20, name = 'PPU[0x1000019] physics[main]'
  thread #3: tid = 0x0648, 0x00007ff843c2cafe combase.dll`InternalDoATClassCreate + 39518
  thread llvm#93: tid = 0x0688, 0x00007ff845830a14 ntdll.dll`NtWaitForAlertByThreadId + 20, name = 'PPU[0x100501d] uMovie::StreamingThread'
  thread #1: tid = 0x087c, 0x00007ff842e7a104 win32u.dll`NtUserMsgWaitForMultipleObjectsEx + 20
  thread llvm#96: tid = 0x0890, 0x00007ff845830a14 ntdll.dll`NtWaitForAlertByThreadId + 20, name = 'PPU[0x1002020] HLE Video Decoder'
<...>
```
qiaojbao pushed a commit that referenced this pull request Jan 26, 2024
When shl is folded in compare instruction, a miscompilation occurs when
the CMP instruction is also sign-extended. For the following IR:

  %op3 = shl i8 %op2, 3
  %tmp3 = icmp eq i8 %tmp2, %op3

It used to generate

   cmp w8, w9, sxtb #3

which means sign extend w9, shift left by 3, and then compare with the
value in w8. However, the original intention of the IR would require
`%op2` to first shift left before extending the operands in the
comparison operation . Moreover, if sign extension is used instead of
zero extension, the sample test would miscompile. This PR creates a fix
for the issue, more specifically to not fold the left shift into the CMP
instruction, and to create a zero-extended value rather than a
sign-extended value.
qiaojbao pushed a commit that referenced this pull request Mar 21, 2024
…lvm#80904)"

This reverts commit b1ac052.

This commit breaks coroutine splitting for non-swift calling convention
functions. In this example:

```ll
; ModuleID = 'repro.ll'
source_filename = "stdlib/test/runtime/test_llcl.mojo"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @craSH to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 }

define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 {
  ret void
}

define dso_local void @craSH(ptr %0) #0 {
  %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0)
  %3 = call ptr @llvm.coro.begin(token %2, ptr null)
  %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0
  %5 = call ptr @llvm.coro.async.resume()
  store ptr %5, ptr %4, align 8
  %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison)
  ret void
}

define dso_local ptr @ctxt_proj_fn(ptr %0) #0 {
  ret ptr %0
}

; Function Attrs: nomerge nounwind
declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1

; Function Attrs: nounwind
declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2

; Function Attrs: nounwind
declare ptr @llvm.coro.begin(token, ptr writeonly) #2

; Function Attrs: nomerge nounwind
declare ptr @llvm.coro.async.resume() #1

attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" }
attributes #1 = { nomerge nounwind }
attributes #2 = { nounwind }
```

This verifier crashes after the `coro-split` pass with

```
cannot guarantee tail call due to mismatched parameter counts
  musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison)
LLVM ERROR: Broken function
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: opt ../../../reduced.ll -O0
 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28
 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20
 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36
 llvm#4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 llvm#5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 llvm#6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 llvm#7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10
 llvm#8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 llvm#9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7
 llvm#10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40
 llvm#11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23
 llvm#12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69
 llvm#13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28
 llvm#14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70
 llvm#15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32
 llvm#16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41
 llvm#17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24
 llvm#18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7
 llvm#19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7
 llvm#20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7
 llvm#21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14
 llvm#22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12
 llvm#23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13
 llvm#24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 llvm#25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12
 llvm#26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 llvm#27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18
 llvm#28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 llvm#29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 llvm#30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74
 llvm#31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3
 llvm#32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20
 llvm#33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33
 llvm#34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17
 llvm#35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7
 llvm#36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7
 llvm#37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7
 llvm#38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10
 llvm#39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27
 llvm#40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33
 llvm#41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
 llvm#42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20
 llvm#43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
 llvm#44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335)
Aborted (core dumped)
qiaojbao pushed a commit that referenced this pull request Jun 5, 2024
...which caused issues like

> ==42==ERROR: AddressSanitizer failed to deallocate 0x32 (50) bytes at
address 0x117e0000 (error code: 28)
> ==42==Cannot dump memory map on emscriptenAddressSanitizer: CHECK
failed: sanitizer_common.cpp:81 "((0 && "unable to unmmap")) != (0)"
(0x0, 0x0) (tid=288045824)
> #0 0x14f73b0c in __asan::CheckUnwind()+0x14f73b0c
(this.program+0x14f73b0c)
> #1 0x14f8a3c2 in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)+0x14f8a3c2
(this.program+0x14f8a3c2)
> #2 0x14f7d6e1 in __sanitizer::ReportMunmapFailureAndDie(void*,
unsigned long, int, bool)+0x14f7d6e1 (this.program+0x14f7d6e1)
> #3 0x14f81fbd in __sanitizer::UnmapOrDie(void*, unsigned
long)+0x14f81fbd (this.program+0x14f81fbd)
> llvm#4 0x14f875df in __sanitizer::SuppressionContext::ParseFromFile(char
const*)+0x14f875df (this.program+0x14f875df)
> llvm#5 0x14f74eab in __asan::InitializeSuppressions()+0x14f74eab
(this.program+0x14f74eab)
> llvm#6 0x14f73a1a in __asan::AsanInitInternal()+0x14f73a1a
(this.program+0x14f73a1a)

when trying to use an ASan suppressions file under Emscripten: Even
though it would be considered OK by SUSv4, the Emscripten runtime states
"We don't support partial munmapping" (see

<emscripten-core/emscripten@f4115eb>
"Implement MAP_ANONYMOUS on top of malloc in STANDALONE_WASM mode
(llvm#16289)").

Co-authored-by: Stephan Bergmann <stephan.bergmann@allotropia.de>
qiaojbao pushed a commit that referenced this pull request Jun 5, 2024
…erSize (llvm#67657)"

This reverts commit f0b3654.

This commit triggers UB by reading an uninitialized variable.

`UP.PartialThreshold` is used uninitialized in `getUnrollingPreferences()` when
it is called from `LoopVectorizationPlanner::executePlan()`. In this case the
`UP` variable is created on the stack and its fields are not initialized.

```
==8802==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x557c0b081b99 in llvm::BasicTTIImplBase<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h
    #1 0x557c0b07a40c in llvm::TargetTransformInfo::Model<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:2277:17
    #2 0x557c0f5d69ee in llvm::TargetTransformInfo::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) const llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:387:19
    #3 0x557c0e6b96a0 in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7624:7
    llvm#4 0x557c0e6e4b63 in llvm::LoopVectorizePass::processLoop(llvm::Loop*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10253:13
    llvm#5 0x557c0e6f2429 in llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo*, llvm::TargetLibraryInfo*, llvm::DemandedBits&, llvm::AssumptionCache&, llvm::LoopAccessInfoManager&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10344:30
    llvm#6 0x557c0e6f2f97 in llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10383:9

[...]

  Uninitialized value was created by an allocation of 'UP' in the stack frame
    #0 0x557c0e6b961e in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7623:3
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
9 participants