[X86] regenerate fcopysign test checks (#183710)#4
Closed
ampandey-1995 wants to merge 5919 commits into
## Implement Lowering for Fortran 2023 Conditional Expressions (R1002)

***This PR contains the lowering steps only for ease of review. DO NOT MERGE until llvm#186489 is merged.***

Implements Fortran 2023 conditional expressions with syntax: `result = (condition ? value1 : condition2 ? value2 : ... : elseValue)`

Issue: llvm#176999
Discourse: https://discourse.llvm.org/t/rfc-adding-conditional-expressions-in-flang-f2023/89869/1 -- note that some of the details provided in the RFC post are no longer accurate

### Implementation Details

**Lowering to HLFIR:**
- Lazy evaluation via nested if-then-else control flow
- Only the selected branch is evaluated
- Temporary allocation with proper cleanup
- Special handling for:
  - CHARACTER types with deferred length
  - Arrays (shape determined by selected branch per F2023 10.1.4(7))
  - Derived types

**LIT Testing:**
- Lowering tests: HLFIR code generation verification
- Note: Executable tests will be added to the llvm-test-suite repo (llvm/llvm-test-suite#369)

**Limitations**
- Conditional arguments are not yet supported. This work is planned - llvm#180592
- Polymorphic types (CLASS) not yet supported in lowering
- Both limitations will emit clear error message if encountered

### Examples

```
! Simple conditional
x = (flag ? 10 : 20)
! Chained
result = (x > 0 ? 1 : x < 0 ? -1 : 0)
! Examples from F2023
( ABS (RESIDUAL)<=TOLERANCE ? 'ok' : 'did not converge' )
( I>0 .AND. I<=SIZE (A) ? A (I) : PRESENT (VAL) ? VAL : 0.0 )
```

AI Usage Disclosure: AI tools (Claude Sonnet 4.5) were used to assist with implementation of this feature and test code generation. I have reviewed, modified, and tested all AI-generated code.
…#188961) When MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS is enabled, the greedy driver verifies the IR after each pattern application. The specialize passes failed because ACCOpReplaceWithVarConversion would run on a data entry op (e.g. acc.create) before container ops that use it in their dataOperands were processed. After replacement, the container op held a non-data-entry operand (e.g. a func arg), failing the acc dialect's dataOperands verifier. Fix: in ACCOpReplaceWithVarConversion, defer by returning failure() when any user of the data entry op's result is a container op that validates its operands as data entry ops (acc.data, acc.parallel, acc.serial, acc.kernels, acc.host_data, acc.kernel_environment, acc.declare_enter, acc.enter_data). The greedy driver will process the container op first (via ACCRegionUnwrapConversion or ACCDeclareEnterOpConversion), removing the use, after which the data entry op can be safely replaced. Assisted-by: Claude Code Fix a failure present with MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON.
Follow up to llvm#190827 Assisted-by: codex
…m#191265) The bcdtor mode affects how the AIX linker chooses to pull static constructors and destructors into the link (https://www.ibm.com/docs/en/aix/7.2.0?topic=l-ld-command). The current setting of `all` makes static init in archive members live regardless of whether the archive member would otherwise be referenced, causing that whole archive member to become part of the link. This default was initially retained for compatibility with historical compilers on the platform, which defaulted to this setting. Unfortunately, this greedy pulling in of static init can have unintended consequences for applications. For example, programs linked against parts of compiler-rt that contain optional instrumentation (with static initializers) get that instrumentation made live in every program, whether or not it is used. For that and similar reasons, this PR switches the default to `mbr`, which only extracts static init from archive members that would otherwise be referenced. This gives behaviour very consistent with linkers on other platforms (e.g. Linux). Users requiring the old default behaviour can manually pass `-bcdtors:all` on the link step, which will override any default we pass here.
(llvm#191786) Fixes CFG construction for default member initializers when `AddCXXDefaultInitExprInCtors` is enabled by correcting the execution order of cleanups. E.g., in

```cpp
struct H {
  std::string_view v = std::string("x");
  H() {}
};
```

Previously, destructors for temporaries in default initializers (here, `std::string("x")`) were sequenced _before_ the member initialization, causing false negatives in lifetime safety analysis because the temporary appeared to be destroyed prematurely, before making it to an origin. Resolved this by modifying `CFGBuilder::addInitializer` to defer these cleanups to the end of the initialization full-expression. _(AI-assisted with HITL)_
This PR improves CodeGenHLSL tests for QuadReadAcrossX. It should cover all supported types along with 16-bit types. Using regex captures to simplify writing checks for subsequent test cases.
This header assumed SmallVector would be included before it
https://reviews.llvm.org/D124669 added -fno-integrated-as driver option but not the fc1 option. As a result, the backend kept MCAsmInfo::useIntegratedAssembler() set and emitted LLVM-only directives such as `.prefalign` (llvm#155529), which GNU as rejects: ``` a.s: Assembler messages: a.s: Error: unknown pseudo-op: `.prefalign' ``` Follow clang and introduce fc1 -no-integrated-as to set `CodeGenOpts.DisableIntegratedAS` and llvm::TargetOptions::DisableIntegratedAS.
…Index` (llvm#189498) This patch fixes the `hasNonUniformIndex` search so that it returns true for any path that connects nuri to an index access.

Fixes: llvm#189438

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
As another step in issue llvm#135812, this patch fixes block frequencies when LoopUnroll converts a conditional latch in an unrolled loop iteration to unconditional. It thus includes complete loop unrolling (the conditional backedge becomes an unconditional loop exit), which might be applied to the original loop or to its remainder loop. As explained in detail in the header comments on the fixProbContradiction function that this patch introduces, these conversions mean LoopUnroll has proven that the original uniform latch probability is incorrect for the original loop iterations associated with the converted latches. However, LoopUnroll often is able to perform these corrections for only some iterations, leaving other iterations with the original latch probability, and thus corrupting the aggregate effect on the total frequency of the original loop body. This patch ensures that the total frequency of the original loop body, summed across all its occurrences in the unrolled loop after the aforementioned conversions, is the same as in the original loop. Unlike other patches in this series, this patch cannot derive the required latch probabilities directly from the original uniform latch probability because it has been proven incorrect for some original loop iterations. Instead, this patch computes entirely new probabilities for the remaining N conditional latches in the unrolled loop. This patch only handles N <= 2, for which it uses simple formulas to compute a single uniform probability across the latches. Future patches will handle N > 2. This patch series does not consider the presence of non-latch loop exits, and I do not have a solid plan for that case. See fixme comments this patch introduces. This patch depends on PR llvm#182403 and PR llvm#191008.
…s for any rank (llvm#188983) The fold for `vector.multi_reduction` only handled the rank-1 case with no reduction dimensions. For higher-rank vectors (e.g., `vector<2x3xf32>`) with empty reduction dims `[]`, the fold returned null, allowing `ElideUnitDimsInMultiDimReduction` to fire incorrectly. That canonicalization pattern checks that all *reduced* dims have size 1, but with zero reduction dims the check trivially passes, and the pattern then computes `acc op source` (e.g., `acc + source`) instead of the correct no-op result (`source`).

This caused `--canonicalize` to produce a different value than `--lower-vector-multi-reduction` for the same program:

    vector.mask %m {
      vector.multi_reduction <add>, %src, %src [] : vector<3x3xi32> to vector<3x3xi32>
    } : vector<3x3xi1> -> vector<3x3xi32>

* Without --lower-vector-multi-reduction: `src + src` (e.g., 2)
* With --lower-vector-multi-reduction: `src` (e.g., 1)

Fix the fold to return `source` for any rank when `reduction_dims` is empty. This makes the empty-dims case consistent: the operation is a noop regardless of rank, and `ElideUnitDimsInMultiDimReduction` no longer gets a chance to mishandle it.

Fixes llvm#129415

Assisted-by: Claude Code
…llvm#191756) The inner CONCAT_VECTORS result type was hardcoded to MVT::v8i1, which is only correct when BitBytes == 1. Otherwise, the inner concat produces fewer elements than 8, causing an assertion failure: Assertion `(Ops[0].getValueType().getVectorElementCount() * Ops.size()) == VT.getVectorElementCount() && "Incorrect element count in vector concatenation!"' failed. Fix by computing the inner vector type dynamically based on BitBytes.
…part 42) (llvm#191751) Tests converted from test/Lower/Intrinsics: storage_size.f90, sum.f90, system_clock.f90, trailz.f90, transfer.f90
…lvm#189241) A new MoveLastSplitAxisPattern class handles the case where the last grid axis of one tensor dimension is moved to the front of another tensor dimension's split axes, e.g. [[0, 1], [2]] -> [[0], [1, 2]].

The three bugs fixed are:

1. detectMoveLastSplitAxisInResharding: compared source.back() with target.back() instead of target.front(), preventing the pattern from being detected for resharding like [[0,1],[2]] -> [[0],[1,2]].
2. targetShardingInMoveLastAxis: axes were appended with push_back but should be inserted at the front, producing wrong split_axes order.
3. handlePartialAxesDuringResharding: a copy_if wrote results into the wrong output variable (addressed structurally by the clean implementation).

Fixes llvm#136117

Assisted-by: Claude Code
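The resharding step described above can be illustrated with a small standalone sketch. It does not use the MLIR API: `SplitAxes`, `moveLastSplitAxis`, and `isMoveLastSplitAxis` are hypothetical names chosen for this example, and the sketch only models the two points the message calls out (compare against the *front* of the target list, and insert at the front rather than appending).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sharding's split axes:
// one list of grid axes per tensor dimension.
using SplitAxes = std::vector<std::vector<int>>;

// Move the last grid axis of tensor dimension srcDim to the FRONT of
// tensor dimension dstDim (inserting at the front, not push_back).
SplitAxes moveLastSplitAxis(SplitAxes axes, std::size_t srcDim,
                            std::size_t dstDim) {
  assert(srcDim < axes.size() && dstDim < axes.size() && srcDim != dstDim);
  assert(!axes[srcDim].empty());
  int moved = axes[srcDim].back();
  axes[srcDim].pop_back();
  axes[dstDim].insert(axes[dstDim].begin(), moved);
  return axes;
}

// Detect whether target is source with the last axis of srcDim moved to
// the front of dstDim. The quick check compares source.back() with
// target.front(), the comparison the first bug got wrong.
bool isMoveLastSplitAxis(const SplitAxes &source, const SplitAxes &target,
                         std::size_t srcDim, std::size_t dstDim) {
  if (source.size() != target.size() || srcDim >= source.size() ||
      dstDim >= source.size() || srcDim == dstDim)
    return false;
  if (source[srcDim].empty() || target[dstDim].empty())
    return false;
  if (source[srcDim].back() != target[dstDim].front())
    return false;
  return moveLastSplitAxis(source, srcDim, dstDim) == target;
}
```

With the example from the message, moving the last axis of dimension 0 to dimension 1 turns [[0, 1], [2]] into [[0], [1, 2]]; appending instead would give [[0], [2, 1]], which is the wrong split_axes order.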
…mMemory." (llvm#191851) Reverts llvm#190681 due to buildbot breakage (llvm#190681 (comment)).
Reverts llvm#191561 This is not required anymore.
…89000) When tiling a rank-0 linalg.generic op, tileUsingSCF returns an empty loops vector (rank-0 ops have no parallel dimensions and produce no scf.forall). Two call sites unconditionally accessed tilingResult.loops.front(), causing a crash:

- tileToForallOpImpl: the loop normalization block was entered whenever mixedNumThreads was empty, regardless of whether any loops exist. Guard it with !tilingResult.loops.empty().
- TileUsingForallOp::apply: tileOps.push_back was called unconditionally. Guard it with !tilingResult.loops.empty().

Add regression tests for both the tile_sizes and num_threads paths, verifying that the linalg.generic is preserved and no scf.forall is emitted.

Fixes llvm#187073

Assisted-by: Claude Code
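The guard described above follows a common pattern: never call `front()` on a container that may be empty. A minimal standalone sketch, assuming a simplified `TilingResult` that stores loop names as strings (the real type holds loop operations; `frontLoop` and `recordTileOp` are hypothetical helpers for this example):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the tiling result; only the
// `loops` member matters for this illustration.
struct TilingResult {
  std::vector<std::string> loops;
};

// Precondition made explicit: at least one loop exists. Calling front()
// on an empty vector is undefined behaviour.
const std::string &frontLoop(const TilingResult &result) {
  assert(!result.loops.empty() && "rank-0 ops produce no loops");
  return result.loops.front();
}

// Guarded call site: only record a tile op when a loop was generated.
void recordTileOp(const TilingResult &result,
                  std::vector<std::string> &tileOps) {
  if (!result.loops.empty())
    tileOps.push_back(frontLoop(result));
}
```

For a rank-0 op the result has no loops, so `recordTileOp` leaves `tileOps` untouched instead of crashing.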
This patch re-enables unicode tests on Windows by improving the `Terminal::SupportsUnicode` check. Checking that the stdout handle is a `FILE_TYPE_CHAR` is a better heuristic than always returning true, which assumed we were always using a terminal and never piping the output.
…1800… (llvm#191835) an issue reported with this patch llvm#191241. Revert for now and reenable later This reverts commit e71da01.
This PR fixes a crash due to a failed assertion in the `from_python` implementations of the type casters. The assertion only triggers if assertions are enabled, which isn't the case for many Python installations, *and* if a Python capsule of the wrong type is used, so it isn't triggered easily. The problem is that the conversion from Python capsules may set the Python error indicator but the callers of the type casters do not expect that. In fact, if there are several overloads of a function, the first may cause the error indicator to be set and the second runs into the assertion. The fix is to unset the error indicator after a failed capsule conversion, which is indicated by the return value of the function anyway. An alternative fix would be to unset the error indicator *inside* the `mlirPythonCapsuleTo*` functions; however, their documentation does say that the Python error indicator is set, so I assume that some callers may *want* to see the indicator and that the responsibility to handle it is on them. Signed-off-by: Ingo Müller <ingomueller@google.com>
llvm#191773) The old name was misleading because this function is not specific to unary ops suggested in llvm#189099 (comment)
…#191493) Follow-up to llvm#188113 per @erichkeane's feedback: `isFundamentalIntType` and `isFundamental()` should not disagree. The previous patch added `!isBitInt()` only inside `IntType::isFundamental()`, leaving the underlying TableGen predicates (`CIR_AnyFundamentalIntType` etc.) unaware of `_BitInt`. That meant `isSignedFundamental()` and `isUnsignedFundamental()` were silently wrong — a `_BitInt(32)` would pass them. This patch adds a `CIR_IsNotBitIntPred` to the three fundamental-int constraint defs so everything stays consistent. `isFundamental()` now just forwards to `isFundamentalIntType()` with no extra logic. Includes an `invalid-bitint.cir` test that checks a `_BitInt(32)` is rejected where a fundamental unsigned int is required. Made with [Cursor](https://cursor.com)
Add CIR-to-LLVM and classic codegen RUN lines to empty.cpp, c89-implicit-int.c, expressions.cpp, binop.c, forward-enum.c, and static-vars.c so each test verifies LLVM IR output from both pipelines. Made with [Cursor](https://cursor.com)
) PR llvm#181071 caused regressions on Linux on Arm. These are being tracked in: - llvm#191855 - llvm#191859 This PR disables the failing tests for now, to fix the broken buildbot.
MachOPlatform::HeaderOptions now includes an optional UUID field. If set, this will be used to build an LC_UUID load command for the JITDylib's MachO header. No testcase: MachOPlatform construction requires the ORC runtime, which we can't require in LLVM regression or unit tests. In the future we should test this through the ORC runtime.
…llvm#191875) Depending on the case, SLP either misses optimizing re-vectorized runtime strided loads (and use a gather instead) or produces the incorrect strided load.
…nclude in `UniqueBBID.h` (llvm#191877) The modules build of LLVM broke when this patch landed ``` commit 2f422a5 Author: Rahman Lavaee <rahmanl@google.com> Date: Fri Apr 10 15:58:16 2026 -0700 [Codegen, X86] Add prefetch insertion based on Propeller profile (llvm#166324) ``` with an error like: ``` [2026-04-11T10:33:41.699Z] While building module 'LLVM_Utils' imported from /Users/ec2-user/jenkins/workspace/m.org_clang-stage2-Rthinlto_main/llvm-project/llvm/lib/Demangle/Demangle.cpp:13: [2026-04-11T10:33:41.699Z] In file included from <module-includes>:321: [2026-04-11T10:33:41.699Z] /Users/ec2-user/jenkins/workspace/m.org_clang-stage2-Rthinlto_main/llvm-project/llvm/include/llvm/Support/UniqueBBID.h:40:3: error: missing '#include "llvm/ADT/StringRef.h"'; 'StringRef' must be declared before it is used [2026-04-11T10:33:41.699Z] 40 | StringRef TargetFunction; [2026-04-11T10:33:41.699Z] | ^ [2026-04-11T10:33:41.699Z] /Users/ec2-user/jenkins/workspace/m.org_clang-stage2-Rthinlto_main/llvm-project/llvm/include/llvm/ADT/StringRef.h:55:24: note: declaration here is not visible [2026-04-11T10:33:41.699Z] 55 | class LLVM_GSL_POINTER StringRef { [2026-04-11T10:33:41.699Z] | ^ [2026-04-11T10:33:41.699Z] /Users/ec2-user/jenkins/workspace/m.org_clang-stage2-Rthinlto_main/llvm-project/llvm/lib/Demangle/Demangle.cpp:13:10: fatal error: could not build module 'LLVM_Utils' [2026-04-11T10:33:41.699Z] 13 | #include "llvm/Demangle/Demangle.h" [2026-04-11T10:33:41.699Z] | ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~ ``` https://ci.swift.org/job/llvm.org/job/clang-stage2-Rthinlto/job/main/150/ This patch tries to fix that by adding the missing include. rdar://174555346
The Hexagon driver's constructHexagonLinkArgs() was not calling addLTOOptions(). This meant that LTO plugin options weren't forwarded to the linker. This caused a crash when using ThinLTO with -fenable-matrix on llvm-test-suite/SingleSource/UnitTests/matrix-types-spec.cpp: LowerMatrixIntrinsicsPass did not run in the LTO backend because -enable-matrix was not forwarded via -plugin-opt. Add the addLTOOptions() call to both the musl and bare-metal code paths in constructHexagonLinkArgs().
… changes to libcxx) (llvm#191681) Further fix for llvm#187788. Previous attempt in PR llvm#188044 only updated the model and model tests, but forgot to update the registered matcher.
…1761) addFlagsUsingAttrFn is hot and showing up in compile-time profiles via llvm::CallLowering::lowerCall. The culprit is std::function callback. Switching to set flags based on AttributeSet directly is a -0.25% compile-time improvement on CTMark AArch64 O0. https://llvm-compile-time-tracker.com/compare.php?from=d35cd21a3757ab6028024f0b47bc9d802d06eae6&to=e717c7017faf2cb386f0d02715fb55d252b3ae42&stat=instructions%3Au
This relates to llvm#35980. Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
…lvm#191450) Enable the close_range syscall on Linux when __NR_close_range is available in kernel headers (Linux 5.9+). On older kernels, the syscall returns ENOSYS and callers fall back gracefully. This fixes the slow FD closing loop in StartSubprocess when RLIMIT_NOFILE is high (e.g. 1B in Docker environments). Fixes llvm#63297. Fixes llvm#152459.
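The fast path with graceful fallback can be sketched as follows. This is not the patch's code: `closeFdRange` and `isOpen` are hypothetical helpers, the sketch assumes a Linux/POSIX environment, and it falls back to a per-descriptor loop on any failure of the syscall (including ENOSYS on older kernels).

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

// Close every descriptor in [first, last]. Returns true when the
// close_range fast path was used, false when the per-fd loop ran.
bool closeFdRange(unsigned first, unsigned last) {
  assert(first <= last);
#ifdef __NR_close_range
  // One syscall closes the whole range; flags = 0 means plain close.
  if (syscall(__NR_close_range, first, last, 0u) == 0)
    return true;
  // Any failure (e.g. ENOSYS before Linux 5.9) falls through to the loop.
#endif
  // Slow path: one close() per descriptor. This is the loop that becomes
  // expensive when the upper bound comes from a very high RLIMIT_NOFILE.
  for (unsigned fd = first;; ++fd) {
    close(static_cast<int>(fd));
    if (fd == last)
      break;
  }
  return false;
}

// Small helper for checking the effect: is fd currently open?
bool isOpen(int fd) { return fcntl(fd, F_GETFD) != -1; }
```

Either path leaves the same observable state: every descriptor in the range is closed afterwards.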
…ind (llvm#191922) Related discussion in: llvm#186946 (comment)
…vm#192110) The argument to `AsCString` was made explicit in 116b045. ``` /home/ewilde/llvm-project/freebsd-lldb-build/lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp:754:48: error: too few arguments to function call, single argument 'value_if_empty' was not specified 754 | module_file_spec.GetFilename().AsCString()); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ /home/ewilde/llvm-project/freebsd-lldb-build/lldb/include/lldb/Utility/ConstString.h:183:15: note: 'AsCString' declared here 183 | const char *AsCString(const char *value_if_empty) const { | ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. ``` Not all use-sites were updated to pass an argument resulting in build failures. I'm updating the errors in the FreeBSD and NetBSD plugins to use formatv instead of expanding the C String, like what is done on Linux, avoiding the issue entirely. rdar://174675042
The HTTP implementation depends on CURL and would preferably not be part of the LLVM dylib. This was not possible as a nested library under libSupport, because libSupport itself is part of the LLVM dylib. This patch moves the HTTP code into a separate top-level library that is independent from libSupport and excluded from the LLVM dylib.
) Fixes: 9116344 (llvm#188692) Assisted-by: Claude --------- Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
These two values ensure that CPU was in kernel privilege at the time of crash. This change is from KGDB's `amd64fbsd-kern.c`. Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
…uno [ C | y ]` (llvm#185844) Recognize two new patterns and fold them as follows:

```
fptrunc(x) ord/uno C --> x ord/uno 0
fptrunc(x) ord/uno fptrunc(y) --> x ord/uno y
```

Fixes llvm#185698

Alive2: https://alive2.llvm.org/ce/z/YvXnBJ
IR diff: dtcxzyw/llvm-opt-benchmark#3551
CompTime impact: dtcxzyw/llvm-opt-benchmark#3552
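The reason these folds are sound is that truncating a floating-point value never changes whether it is NaN, and `ord`/`uno` only ask about NaN-ness. The sketch below models this on the host with `double` to `float` conversion standing in for `fptrunc`; the helper names are hypothetical, and it assumes the constant C is not NaN (an assumption of this sketch, not something stated in the message).

```cpp
#include <cassert>
#include <cmath>

// fcmp ord: true when neither operand is NaN. fcmp uno is its negation.
bool ord(double a, double b) { return !std::isnan(a) && !std::isnan(b); }
bool uno(double a, double b) { return !ord(a, b); }

// Before the fold: fptrunc(x) ord C, with C assumed to be a non-NaN constant.
bool truncOrdConst(double x, float c) {
  assert(!std::isnan(c));
  float truncated = static_cast<float>(x); // models fptrunc double -> float
  return ord(truncated, c);
}

// After the fold: x ord 0.
bool foldedOrdConst(double x) { return ord(x, 0.0); }

// Before the second fold: fptrunc(x) ord fptrunc(y). After: x ord y.
bool truncOrdBoth(double x, double y) {
  return ord(static_cast<float>(x), static_cast<float>(y));
}
```

For any input, `truncOrdConst(x, c)` agrees with `foldedOrdConst(x)` and `truncOrdBoth(x, y)` agrees with `ord(x, y)`; the `uno` variants follow by negation.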
The function to lower wave reduce pseudos is already quite large, and a few more operations still need to be supported. This patch refactors some of the code to make it more manageable.

Summary of changes:

1. Moved the expansion for `V_CNDMASK_B64_PSEUDO` to a separate function. It's needed for 64 bit dpp operations.
2. Collapsed `getIdentityValueFor32BitWaveReduction` and `getIdentityValueFor64BitWaveReduction` into a single function which returns a 64 bit unsigned value.
3. Modified `getDPPOpcForWaveReduction` to also return the `Clamp` opcode.
4. Added a lambda, `BuildRegSequence`, and a static function, `ExtractSubRegs`, as those code blocks are repeated with little variation.
5. Moved logic for setting identity value in inactive lanes to `BuildSetInactiveInstr`.
Several tests seemed to require asserts despite not testing any debug output so I have removed the line.
Supported Ops: `min`, `max`, `umin`, `umax`
Supported Ops: `add`, `sub`
Supported Ops: `and`, `or`, `xor`
Supported Ops: `fmin` and `fmax`
Supported Ops: `fadd` and `fsub`
…191804) OpenCL any()/all() builtins receive integer vectors, but OpAny/OpAll require boolean vector inputs per the SPIR-V spec related to llvm#190736
Add initial AddressSanitizer support for AMDGPU by intercepting HSA APIs used for host-side initialization and CPU↔GPU memory operations (allocation, copies, IPC, vmem, and other related queries). This support is guarded by `SANITIZER_AMDGPU`, which enables inclusion of the ROCm HSA and COMgr headers required to build the interceptors.
ampandey-1995
pushed a commit
that referenced
this pull request
Apr 15, 2026
Running gcc test c-c++-common/tsan/tls_race.c on s390 we get:

    ThreadSanitizer: CHECK failed: tsan_platform_linux.cpp:618 "((thr_beg)) >= ((tls_addr))" (0x3ffaa35e140, 0x3ffaa35e250) (tid=2419930)
        #0 __tsan::CheckUnwind() /devel/src/libsanitizer/tsan/tsan_rtl.cpp:696 (libtsan.so.2+0x91b57)
        #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /devel/src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 (libtsan.so.2+0xd211b)
        #2 __tsan::ImitateTlsWrite(__tsan::ThreadState*, unsigned long, unsigned long) /devel/src/libsanitizer/tsan/tsan_platform_linux.cpp:618 (libtsan.so.2+0x8faa3)
        #3 __tsan::ThreadStart(__tsan::ThreadState*, unsigned int, unsigned long long, __sanitizer::ThreadType) /devel/src/libsanitizer/tsan/tsan_rtl_thread.cpp:225 (libtsan.so.2+0xaadb5)
        #4 __tsan_thread_start_func /devel/src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1065 (libtsan.so.2+0x3d34d)
        #5 start_thread <null> (libc.so.6+0xae70d) (BuildId: d3b08de1b543c2d15d419bf861b3c2e4c01ac75b)
        #6 thread_start <null> (libc.so.6+0x12d2ff) (BuildId: d3b08de1b543c2d15d419bf861b3c2e4c01ac75b)

In order to determine the static TLS blocks in GetStaticTlsBoundary we iterate over the modules and try to find the largest range without a gap. Here the modules may be spaced exactly by the alignment. For example, for the failing test we have:

    (gdb) p/x ranges.data_[0]
    $1 = {begin = 0x3fff7f9e6b8, end = 0x3fff7f9e740, align = 0x8, tls_modid = 0x3}
    (gdb) p/x ranges.data_[1]
    $2 = {begin = 0x3fff7f9e740, end = 0x3fff7f9eed0, align = 0x40, tls_modid = 0x2}
    (gdb) p/x ranges.data_[2]
    $3 = {begin = 0x3fff7f9eed8, end = 0x3fff7f9eef8, align = 0x8, tls_modid = 0x4}
    (gdb) p/x ranges.data_[3]
    $4 = {begin = 0x3fff7f9eefc, end = 0x3fff7f9ef00, align = 0x4, tls_modid = 0x1}

where ranges[3].begin == ranges[2].end + ranges[3].align holds. Since the loop uses a strict inequality test, we compute the wrong address

    (gdb) p/x *addr
    $5 = 0x3fff7f9eefc

whereas 0x3fff7f9e6b8 is expected, which is why we bail out in the subsequent check.
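The effect of the strict comparison can be reproduced with the gdb values quoted above. The sketch below is an illustration only: the message does not show the loop in GetStaticTlsBoundary, so `TlsRange`, `staticTlsBegin`, the starting index, and the exact shape of the comparison are assumptions inferred from the stated condition `ranges[3].begin == ranges[2].end + ranges[3].align`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Field names follow the gdb output in the message.
struct TlsRange {
  std::uint64_t begin, end, align;
  unsigned tls_modid;
};

// Starting from ranges[idx], walk left while the previous range is
// considered contiguous, i.e. the gap is within this range's alignment.
// `inclusive` selects <= (gap exactly equal to the alignment still counts
// as contiguous) versus the strict < described in the message.
std::uint64_t staticTlsBegin(const std::vector<TlsRange> &ranges,
                             std::size_t idx, bool inclusive) {
  assert(idx < ranges.size());
  std::size_t l = idx;
  while (l != 0) {
    std::uint64_t limit = ranges[l - 1].end + ranges[l].align;
    bool contiguous =
        inclusive ? ranges[l].begin <= limit : ranges[l].begin < limit;
    if (!contiguous)
      break;
    --l;
  }
  return ranges[l].begin;
}
```

With the four ranges from the failing test and a start at index 3, the strict form stops immediately and yields 0x3fff7f9eefc, while the inclusive form walks back to index 0 and yields the expected 0x3fff7f9e6b8.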
ampandey-1995
pushed a commit
that referenced
this pull request
Apr 22, 2026
…bols add' (llvm#188377) Context: lldb might crash when running to a debuggee crashing state and do a target symbols add command. Backtrace: ``` #0 0x000055ca6790dc65 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:848:11 #1 0x000055ca6790e434 PrintStackTraceSignalHandler(void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:931:1 #2 0x000055ca6790b839 llvm::sys::RunSignalHandlers() /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Signals.cpp:104:5 #3 0x000055ca6790ff6b SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:430:38 #4 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 #5 0x00007fe9e5f25649 syscall /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S:38:0 #6 0x00007fe9ec649170 SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:429:7 #7 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 #8 0x00007fe9ebb77bf0 lldb_private::operator<(lldb_private::StackID const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackID.cpp:99:16 llvm#9 0x00007fe9ebb6863d CompareStackID(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:683:3 llvm#10 0x00007fe9ebb6d049 bool __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>::operator()<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, 
std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/predefined_ops.h:196:4 llvm#11 0x00007fe9ebb6cefe __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::__lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algobase.h:1464:8 llvm#12 0x00007fe9ebb6cdfc __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, 
std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algo.h:2062:14 llvm#13 0x00007fe9ebb685fa auto llvm::lower_bound<std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2001:10 llvm#14 0x00007fe9ebb68441 lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:697:11 llvm#15 0x00007fe9ebbee395 lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/include/lldb/Target/Thread.h:459:7 llvm#16 0x00007fe9ebac7cf7 lldb_private::ExecutionContextRef::GetFrameSP() const 
/home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:643:25 llvm#17 0x00007fe9ebac80e1 lldb_private::GetStoppedExecutionContext(lldb_private::ExecutionContextRef const*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:164:34 llvm#18 0x00007fe9eb8903fa lldb_private::Statusline::Redraw(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Statusline.cpp:139:7 llvm#19 0x00007fe9eb7ac8be lldb_private::Debugger::RedrawStatusline(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1233:3 llvm#20 0x00007fe9eb804d1e lldb_private::IOHandlerEditline::RedrawCallback() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:446:3 llvm#21 0x00007fe9eb80aa81 lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2::operator()() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:262:73 llvm#22 0x00007fe9eb80aa5d void llvm::detail::UniqueFunctionBase<void>::CallImpl<lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2>(void*) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:213:5 llvm#23 0x00007fe9eb93bfbf llvm::unique_function<void ()>::operator()() 
/home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:365:5 llvm#24 0x00007fe9eb93bb80 lldb_private::Editline::GetCharacter(wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:0:5 llvm#25 0x00007fe9eb941a18 lldb_private::Editline::ConfigureEditor(bool)::$_0::operator()(editline*, wchar_t*) const /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1287:5 llvm#26 0x00007fe9eb9419e2 lldb_private::Editline::ConfigureEditor(bool)::$_0::__invoke(editline*, wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1286:27 llvm#27 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:439:14 llvm#28 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:400:1 llvm#29 0x00007fe9f3384f90 read_getcmd /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:247:14 llvm#30 0x00007fe9f3384f90 el_gets /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:586:14 llvm#31 0x00007fe9eb9409f3 lldb_private::Editline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1636:16 llvm#32 0x00007fe9eb8044d7 lldb_private::IOHandlerEditline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:339:5 llvm#33 0x00007fe9eb805609 lldb_private::IOHandlerEditline::Run() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:600:11 llvm#34 0x00007fe9eb7b214c lldb_private::Debugger::RunIOHandlers() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1280:16 llvm#35 0x00007fe9eb98f00f lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) 
/home/hyubo/osmeta/external/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:3620:16 llvm#36 0x00007fe9eb4f0e09 lldb::SBDebugger::RunCommandInterpreter(bool, bool) /home/hyubo/osmeta/external/llvm-project/lldb/source/API/SBDebugger.cpp:1234:42 llvm#37 0x000055ca6788d6b0 Driver::MainLoop() /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:677:3 llvm#38 0x000055ca6788e226 main /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:887:17 llvm#39 0x00007fe9e5e2c657 __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16 llvm#40 0x00007fe9e5e2c718 call_init /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:128:20 llvm#41 0x00007fe9e5e2c718 __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:379:5 llvm#42 0x000055ca67889a11 _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:118:0 Segmentation fault (core dumped) ``` When `target symbols add` is run, `Symtab::AddSymbol()` can reallocate the underlying `std::vector<Symbol>` and resize it, invalidating all existing Symbol* pointers. While `Process::Flush()` clears stale stack frames, the statusline caches its own `ExecutionContextRef` containing a `StackID` with a `SymbolContextScope*` (which can be a `Symbol*`). This cached reference is not cleared by `Process::Flush()`, so the next statusline redraw accesses a dangling pointer and crashes. Fix this by adding `Statusline::Flush()` which clears the cached frame, `Debugger::Flush()` which forwards to it under the statusline mutex, and calling `Debugger::Flush()` from `Process::Flush()` so that all flush paths (symbol add, exec, module load) also invalidate the statusline's stale state. 
After this fix, lldb no longer crashes, and new symbols from a symbol file are loaded correctly. --------- Co-authored-by: George Hu <georgehuyubo@gmail.com>
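The stale-cache hazard described above can be sketched in isolation. This is a minimal illustration, not the lldb code: `SymbolTable`, `StatuslineCache`, and `AddSymbolAndFlush` are hypothetical stand-ins, and the only point is that any cached pointer into a growing `std::vector` has to be dropped whenever the vector may have reallocated.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the lldb classes.
struct Symbol {
  std::string name;
};

class SymbolTable {
public:
  // push_back may reallocate the storage and invalidate every Symbol*.
  void Add(std::string name) { symbols_.push_back(Symbol{std::move(name)}); }
  const Symbol *At(std::size_t i) const { return &symbols_[i]; }

private:
  std::vector<Symbol> symbols_;
};

class StatuslineCache {
public:
  void Remember(const Symbol *s) { cached_ = s; }
  // Drop the cached pointer so it can never be read after a reallocation.
  void Flush() { cached_.reset(); }
  bool HasFrame() const { return cached_.has_value(); }
  // Only meaningful while HasFrame() is true.
  const Symbol *Frame() const { return *cached_; }

private:
  std::optional<const Symbol *> cached_;
};

// Every mutation of the table is paired with a flush of the cache, so the
// next redraw has to look the symbol up again instead of reusing the pointer.
inline void AddSymbolAndFlush(SymbolTable &table, StatuslineCache &cache,
                              std::string name) {
  table.Add(std::move(name));
  cache.Flush();
}
```

Pairing the flush with the mutation means a caller cannot forget one of the two steps. The patch gets the same effect by calling `Debugger::Flush()` from `Process::Flush()`.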
ampandey-1995 pushed a commit that referenced this pull request May 4, 2026
…input" (llvm#195551) Reverts llvm#190863 due to buildbot breakage e.g., https://lab.llvm.org/buildbot/#/builders/52/builds/16951 ``` Failed Tests (1): LLVM :: tools/llvm-profgen/filter-build-id.test ``` ``` ==llvm-profgen==3809550==ERROR: AddressSanitizer: container-overflow on address 0x6e80441e1762 at pc 0x6216c3f2cdce bp 0x7fff3c3ddf60 sp 0x7fff3c3dd710 READ of size 8 at 0x6e80441e1762 thread T0 #0 0x6216c3f2cdcd in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:848:7 #1 0x6216c3f2d25c in bcmp /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:894:10 #2 0x6216c400b836 in operator== /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:914:10 #3 0x6216c400b836 in operator!= /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:917:69 #4 0x6216c400b836 in llvm::sampleprof::PerfScriptReader::extractCallstack(llvm::sampleprof::TraceStream&, llvm::SmallVectorImpl<unsigned long>&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:801:36 #5 0x6216c400d37a in llvm::sampleprof::HybridPerfReader::parseSample(llvm::sampleprof::TraceStream&, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:881:8 #6 0x6216c40150d8 in parseSample /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1118:3 #7 0x6216c40150d8 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5 #8 
0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5 llvm#9 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3 llvm#10 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19 llvm#11 0x72404502a8c0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#12 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#13 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4) 0x6e80441e1762 is located 18 bytes inside of 48-byte region [0x6e80441e1750,0x6e80441e1780) allocated by thread T0 here: #0 0x6216c3feab0d in operator new(unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:109:35 #1 0x724045511c07 in __libcpp_allocate<char> /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__new/allocate.h:42:28 #2 0x724045511c07 in allocate /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:92:14 #3 0x724045511c07 in allocate_at_least /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator.h:99:13 #4 0x724045511c07 in allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocator_traits.h:340:22 #5 0x724045511c07 in __allocate_at_least<std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/__memory/allocate_at_least.h:36:16 #6 
0x724045511c07 in __allocate_long_buffer /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2259:21 #7 0x724045511c07 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__grow_by(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/libcxx/include/string:2769:25 #8 0x6216c401d90a in __grow_by_without_replace /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:2795:3 llvm#9 0x6216c401d90a in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>& std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::append[abi:sqn230000]<char const*, 0>(char const*, char const*) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/string:1431:9 llvm#10 0x6216c401d1a6 in std::__1::basic_istream<char, std::__1::char_traits<char>>& std::__1::getline[abi:sqn230000]<char, std::__1::char_traits<char>, std::__1::allocator<char>>(std::__1::basic_istream<char, std::__1::char_traits<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, char) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1309:15 llvm#11 0x6216c4014a76 in getline<char, std::__1::char_traits<char>, std::__1::allocator<char> > /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/libcxx_install_asan/include/c++/v1/istream:1343:10 llvm#12 0x6216c4014a76 in advance /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.h:52:10 llvm#13 0x6216c4014a76 in llvm::sampleprof::PerfScriptReader::parseAggregatedCount(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1110:13 llvm#14 0x6216c4015095 in parseSample 
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1116:20 llvm#15 0x6216c4015095 in llvm::sampleprof::PerfScriptReader::parseEventOrSample(llvm::sampleprof::TraceStream&) /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1201:5 llvm#16 0x6216c401539a in llvm::sampleprof::PerfScriptReader::parseAndAggregateTrace() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1210:5 llvm#17 0x6216c4018c88 in llvm::sampleprof::PerfScriptReader::parsePerfTraces() /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp:1457:3 llvm#18 0x6216c3ff2c7a in main /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/tools/llvm-profgen/llvm-profgen.cpp:229:19 llvm#19 0x72404502a8c0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a8c0) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#20 0x72404502a9d7 in __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a9d7) (BuildId: ae327f26c123ea1374623c41e676a4bf00e5c1cb) llvm#21 0x6216c3f0f3d4 in _start (/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-profgen+0x2f083d4) ```
ampandey-1995 pushed a commit that referenced this pull request May 21, 2026
llvm#183506 revealed a pre-existing use-after-scope in createInstrInfo (MSan bot: https://lab.llvm.org/buildbot/#/builders/164/builds/21562 [*]). This patch fixes the issue by changing the stack-allocated AArch64Subtarget (which goes out of scope once createInstrInfo() returns) into heap-allocated, allowing it to be safely stored in the returned AArch64InstrInfo. ----- [*] WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55555666fabd in llvm::AArch64InstrInfo::getInstSizeInBytes(llvm::MachineInstr const&) const /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:247:5 ... /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:85:3 llvm#9 0x555556508559 in InstSizes_MOVaddrTagged_Test::TestBody() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:301:3 ... Member fields were destroyed #0 0x555556498a1d in __sanitizer_dtor_callback_fields /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1074:5 #1 0x5555564fbda6 in ~Triple /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:348:12 #2 0x5555564fbda6 in ~Triple /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:47:7 #3 0x5555564fbda6 in llvm::AArch64Subtarget::~AArch64Subtarget() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64Subtarget.h:38:7 #4 0x555556503396 in (anonymous namespace)::createInstrInfo(llvm::TargetMachine*) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:38:1 #5 0x5555565084cb in InstSizes_MOVaddrTagged_Test::TestBody() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:299:42
[X86] regenerate fcopysign test checks (llvm#183710)
Fix vpternlog comments
[AMDGPU] Support i8/i16 GEP indices when promoting allocas to vectors (llvm#175489)
Allow promote alloca to vector to form a vector element index from i8/i16
GEPs when the dynamic offset is known to be element size aligned.
Example:
Or:
[lldb][test] Re-enable TestDyldLaunchLinux.py for Linux/Arm (llvm#181221)
The test was disabled in c55e021, but it now passes, with both remote
and local runs.
[MemorySSA] Make `getBlockDefs` and `getBlockAccesses` return a non-const list (NFC)
As per discussion at llvm#181709 (comment),
users may already get a non-const MemoryAccess pointer via
`getMemoryAccess` for a given instruction. Drop the restriction
on directly iterating over them by modifying the public
`getBlockDefs`/`getBlockAccesses` APIs to return a mutable list, thus dropping the
now obsolete distinction with the
`getWritableBlockDefs` and `getWritableBlockAccesses` helpers.
[AMDGPU] Remove unused CmpLGOp instruction (llvm#180195)
The instruction was accidentally added, remove it.
Rename OrN2Op to OrN2Opc for consistency with other names
[SCEV] Always return true for isKnownToBeAPowerOfTwo for SCEVVScale (llvm#183693)
After llvm#183080 vscale is always a power of two, so we don't need to check
for the vscale_range attribute.
[mlir][affine] Fix crash in linearize_index fold when basis is ub.poison (llvm#183650)
`foldCstValueToCstAttrBasis` iterates the folded dynamic basis values
and erases any operand whose folded attribute is non-null (i.e., was
constant-folded). When an operand folds to `ub.PoisonAttr`, the
attribute is non-null so the operand was erased from the dynamic operand
list. However, `getConstantIntValue` on the corresponding `OpFoldResult`
in `mixedBasis` returns `std::nullopt` for poison (it is not an integer
constant), so the position was left as `ShapedType::kDynamic` in the
returned static basis.
This left the op in an inconsistent state: the static basis claimed one
more dynamic entry than actually existed. A subsequent call to
`getMixedBasis()` triggered the assertion inside `getMixedValues`.
Fix by skipping poison attributes in the erasure loop, treating them
like non-constant values. This keeps the dynamic operand and its
matching `kDynamic` entry in the static basis consistent.
Fixes llvm#179265
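The invariant behind this fix can be modelled without MLIR. The sketch below is an illustration under assumptions: `FoldKind`, `Folded`, `Basis`, `foldBasis`, and `kDynamicSentinel` are hypothetical names, and every basis entry is assumed to start out dynamic. It only shows that treating poison like a non-constant keeps the number of dynamic slots equal to the number of remaining dynamic operands.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical model of what a dynamic basis operand folded to.
enum class FoldKind { NotConstant, Poison, Integer };

struct Folded {
  FoldKind kind;
  std::int64_t value; // meaningful only when kind == FoldKind::Integer
};

// Sentinel marking a dynamic slot in the static basis (an assumption of this
// sketch, chosen to resemble a "dynamic" marker).
constexpr std::int64_t kDynamicSentinel =
    std::numeric_limits<std::int64_t>::min();

struct Basis {
  std::vector<std::int64_t> staticBasis;    // constants or kDynamicSentinel
  std::vector<std::size_t> dynamicOperands; // indices of operands that remain
};

// Only integer constants are moved into the static basis. Poison is kept as
// a dynamic operand, exactly like a value that did not fold at all.
Basis foldBasis(const std::vector<Folded> &folded) {
  Basis out;
  for (std::size_t i = 0; i < folded.size(); ++i) {
    if (folded[i].kind == FoldKind::Integer) {
      out.staticBasis.push_back(folded[i].value);
    } else {
      out.staticBasis.push_back(kDynamicSentinel);
      out.dynamicOperands.push_back(i);
    }
  }
  return out;
}

// The invariant that the original bug violated.
bool isConsistent(const Basis &basis) {
  std::size_t dynamicSlots = 0;
  for (std::int64_t entry : basis.staticBasis)
    if (entry == kDynamicSentinel)
      ++dynamicSlots;
  return dynamicSlots == basis.dynamicOperands.size();
}
```

Erasing the poison operand while leaving its slot dynamic would make `isConsistent` return false, which corresponds to the state that tripped the assertion.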
[VPlan] Remove non-power-of-2 scalable VF comment. NFC (llvm#183719)
No longer holds after llvm#183080
[mlir][dataflow] Fix crash in IntegerRangeAnalysis with non-constant loop bounds (llvm#183660)
When visiting non-control-flow arguments of a LoopLikeOpInterface op,
IntegerRangeAnalysis assumed that getLoopLowerBounds(),
getLoopUpperBounds(), and getLoopSteps() always return non-null values
when getLoopInductionVars() is non-null. This assumption is incorrect:
for example, AffineForOp returns nullopt from getLoopUpperBounds() when
the upper bound is not a constant affine expression (e.g., a dynamic
index from a tensor.dim).
Fix this by checking whether the bound optionals are engaged before
dereferencing them and falling back to the generic analysis if any bound
is unavailable.
Fixes llvm#180312
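The guard described above amounts to checking that each optional bound is engaged before dereferencing it. A minimal sketch, assuming a half-open loop range and a hypothetical `inferInductionVarRange` helper (neither the name nor the `Range` struct comes from MLIR):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

struct Range {
  std::int64_t min;
  std::int64_t max;
};

// Infers an inclusive range for an induction variable that iterates over
// [lower, upper) with a positive step. Falls back to the widest possible
// range when any bound is unknown, instead of dereferencing an empty optional.
Range inferInductionVarRange(std::optional<std::int64_t> lower,
                             std::optional<std::int64_t> upper,
                             std::optional<std::int64_t> step) {
  const Range fallback{std::numeric_limits<std::int64_t>::min(),
                       std::numeric_limits<std::int64_t>::max()};
  if (!lower || !upper || !step)
    return fallback; // a bound is not a known constant
  if (*step <= 0 || *upper <= *lower)
    return fallback; // this sketch does not model empty or reversed loops
  return Range{*lower, *upper - 1};
}
```

Returning the widest range is the conservative choice: a client of the analysis sees "unknown" rather than a bound computed from garbage.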
[flang] Use CHECK-DAG to check constants (NFC) (llvm#183687)
It is part of llvm#180556, as a
separate NFC PR
[LangRef] Clarify in vscale_range that vscale is a power-of-two without the attribute (llvm#183689)
Previously vscale_range used to add the constraint that vscale is a
power-of-two, but after llvm#183080 it's already a power-of-two to begin
with.
This clarifies the sentence about assumptions when there is no attribute
[Clang] support C23 constexpr struct member access in constant expressions (llvm#182770)
Fixes llvm#178349
This patch resolves an issue where accessing C23 `constexpr` struct
members using the dot operator was not recognized as a constant
expression.
According to C23 spec:
[NFC][SPIRV] Remove dead code from `SPIRVPostLegalizer.cpp` (llvm#183585)
Both `visit` functions are unused in the file and outside of it.
[VPlan] Add nuw to unrolled canonical IVs (llvm#183716)
After llvm#183080, the canonical IV (not the increment!) can't overflow. So
now canonical IVs that are unrolled will have steps that don't overflow,
so we can add the nuw flag.
This allows us to tighten the VPlanVerifier isKnownMonotonic check by
restricting it to adds with nuw.
[LLVM][ExecutionEngine] Add vector ConstantInt/FP support to getConstantValue(). (llvm#182538)
Unify vector constant handling via calls to getAggregateElement rather
than handling each constant type separately.
[flang][OpenMP] Add `is_range<R>` trait to detect classes with begin/end, NFC (llvm#183615)
[analyzer] Fix crash in MallocChecker when a function has both ownership_returns and ownership_takes (llvm#183583)
When a function was annotated with both `ownership_returns` and
`ownership_takes` (or `ownership_holds`), MallocChecker::evalCall would
fall into the freeing-only branch (isFreeingOwnershipAttrCall) and call
checkOwnershipAttr without first calling MallocBindRetVal. That meant no
heap symbol had been conjured for the return value, so
checkOwnershipAttr later dereferenced a null/invalid symbol and crashed.
Fix: merge the two dispatch branches so that MallocBindRetVal is always
called first whenever ownership_returns is present, regardless of
whether the function also carries ownership_takes/ownership_holds.
The crash was introduced in llvm#106081
339282d.
Released in clang-20, and crashing ever since.
Fixes llvm#183344.
Assisted-By: claude
[X86] Fold XOR of two vgf2p8affineqb instructions with same input (llvm#179900)
This patch implements an optimization to fold XOR of two
`vgf2p8affineqb` instructions operating on the same input.
This optimization:
Reduces instruction count from 3 to 2
Eliminates one vgf2p8affineqb instruction
Added combineXorWithTwoGF2P8AFFINEQB function in X86ISelLowering.cpp
Uses sd_match pattern matching for consistency with existing code
Checks that both operations have single use to avoid code bloat
Verifies both operations use the same input
Handles commutative XOR patterns automatically
[gn] port 3490d28
[LLVM][Runtimes] Add 'llvm-gpu-loader' to dependency list (llvm#183601)
Summary:
This is used to run the unit tests in libc / libc++. It must exist in
the build directory's binary path, but without this dependency we may
not build it before running the runtimes build. This should ensure that
it's present, and only if we have tests enabled.
[AMDGPU][Scheduler] Add `GCNRegPressure`-based methods to `GCNRPTarget` (llvm#182853)
This adds a few methods to `GCNRPTarget` that can estimate/perform RP
savings based on `GCNRegPressure` instead of a single `Register`,
opening the door to model/incorporate more complex savings made up of
multiple registers of potentially different classes. The scheduler's
rematerialization stage now uses this new API.
Although there are no test changes this is not really NFC since register
pressure savings in the rematerialization stage are now computed through
`GCNRegPressure` instead of the stage itself. If anything this makes
them more consistent with the rest of the RP-tracking infrastructure.
[MIR] Error on signed integer in getUnsigned (llvm#183171)
Previously we effectively took the absolute value of the APSInt; instead,
diagnose the unexpected negative value.
Change-Id: I4efe961e7b29fdf1d5f97df12f8139aac12c9219
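The behaviour change can be illustrated with plain integers. This is a sketch only: `toUnsigned` is a hypothetical helper rather than the MIR parser's `getUnsigned`, and an empty `std::optional` stands in for emitting a diagnostic.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Converts a parsed signed value to unsigned. A negative or out-of-range
// input is rejected rather than silently turned into some positive number.
std::optional<unsigned> toUnsigned(std::int64_t value) {
  if (value < 0)
    return std::nullopt; // the caller would report an error here
  if (static_cast<std::uint64_t>(value) >
      std::numeric_limits<unsigned>::max())
    return std::nullopt; // does not fit in unsigned
  return static_cast<unsigned>(value);
}
```

With this shape, a caller has to handle the failure case explicitly, so a stray minus sign in the input surfaces as an error instead of a wrong value.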
[NFC][SPIRV] Fix compile warnings (llvm#183725)
Fix compile warnings in SPIR-V
[AMDGPU][Scheduler] Fix compilation fail in EXPENSIVE_CHECKS (llvm#183745)
Bug introduced by llvm#182853 (`Remat` is now a pointer).
[MLIR] Fix OpenACC parser crash with opaque pointers (llvm#183521)
Fixes llvm#181453
Fixes llvm#181589
[MLIR][Vector] Enhance shape_cast unrolling support in case the target shape is [1, 1, ..1] (llvm#183436)
This PR fixes a minor issue in shape_cast unrolling: when all target
dimensions are unit-sized, it no longer removes all leading unit
dimensions.
[SelectionDAG] Fix CLMULR/CLMULH expansion (llvm#183537)
For v8i8 on AArch64, `expandCLMUL` picked the zext path (ExtVT=v8i16) since ZERO_EXTEND/SRL were legal, but CLMUL on v8i16 is not, resulting in a bit-by-bit expansion (~42 insns). Prefer the bitreverse path when CLMUL is legal on VT but not ExtVT.
v8i8 CLMULR: 42 → 4 instructions.
Fixes llvm#182780
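The bitreverse path relies on an identity between the low and "reversed" halves of a carry-less product. The sketch below checks it on scalar 8-bit values. It assumes CLMULR means bits 7..14 of the 15-bit product (the RISC-V Zbc `clmulr` convention), which may not match the SelectionDAG node definition exactly, and the helper names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Full carry-less (XOR) product of two 8-bit values; at most 15 bits wide.
std::uint16_t clmulFull8(std::uint8_t a, std::uint8_t b) {
  std::uint16_t product = 0;
  for (int i = 0; i < 8; ++i)
    if ((b >> i) & 1)
      product = static_cast<std::uint16_t>(product ^ (a << i));
  return product;
}

// Low 8 bits of the product.
std::uint8_t clmul8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(clmulFull8(a, b) & 0xFF);
}

// Bits 7..14 of the product (assumed CLMULR semantics, see lead-in).
std::uint8_t clmulr8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((clmulFull8(a, b) >> 7) & 0xFF);
}

std::uint8_t bitreverse8(std::uint8_t v) {
  std::uint8_t r = 0;
  for (int i = 0; i < 8; ++i)
    r = static_cast<std::uint8_t>((r << 1) | ((v >> i) & 1));
  return r;
}

// Same result using only an 8-bit CLMUL plus three bit reversals, with no
// widening to 16 bits.
std::uint8_t clmulrViaBitreverse8(std::uint8_t a, std::uint8_t b) {
  return bitreverse8(clmul8(bitreverse8(a), bitreverse8(b)));
}
```

The identity holds because reversing both inputs mirrors the product around bit 7, so the low half of the mirrored product is the reversed upper half of the original one.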
[mlir][Python] Drop Python <=3.9 compatibility path (llvm#183416)
According to PR llvm#163499, minimum Python version for mlir-py is now 3.10,
we no longer need patches for py<=3.9.
This change aligns with the Python version bump policy discussed
here.
[mlir][tensor] Fix crash in tensor.from_elements fold with non-scalar element types (llvm#183659)
The fold for tensor.from_elements attempted to always produce a
DenseElementsAttr by calling DenseElementsAttr::get(type, elements).
However, DenseElementsAttr::get only handles basic scalar element types
(integer, index, float, complex) directly. For other element types such
as vector types, it expects StringAttr (raw bytes) for each element,
which folded constants won't provide — triggering an assertion.
Fix this by guarding the fold: only attempt the DenseElementsAttr fold
when the tensor element type is integer, index, float, or complex.
Fixes llvm#180459
[mlir][transforms] Fix crash in remove-dead-values when function has non-call users (llvm#183655)
`processFuncOp` asserts that all symbol uses of a function are
`CallOpInterface` operations. This is violated when a function is
referenced by a non-call operation such as `spirv.EntryPoint`, which
uses the function symbol for metadata purposes without calling it.
Fix this by replacing the assertion with an early return: if any user of
the function symbol is not a `CallOpInterface`, skip the function
entirely. This is safe because the pass cannot determine the semantics
of arbitrary non-call references, so it should leave such functions
alone.
Fixes llvm#180416
[lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory() (llvm#183553)
Implement `ProcessFreeBSDKernelCore::DoWriteMemory()` to write data to a
kernel crash dump or `/dev/mem`. For safety reasons (e.g. writing a
wrong value to `/dev/mem` can trigger a kernel panic), this feature is
only enabled when `plugin.process.freebsd-kernel-core.read-only` is set
to false (true by default).
Since 85a1fe6 (llvm#183237) was reverted as it was prematurely merged, I'm
committing changes again with corrections here.
Signed-off-by: Minsoo Choo minsoochoo0122@proton.me
[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions (llvm#183567)
Review follow-up from llvm#183325
No reason for these data structures to be ordered.
Minor annoyance when trying to use `DenseMap` because of the C++ code
for enums generated by TableGen, but not too bad.
Signed-off-by: Nick Sarnie nick.sarnie@intel.com
[clang][DebugInfo] Rename _vtable$ to __clang_vtable (llvm#183617)
Discussion is a follow-up from
llvm#182762 (comment)
(where we're discussing how LLDB could make use of this symbol for
vtable detection).
`_vtable$` is not a reserved identifier in C or C++. In order for
debuggers to reliably use this symbol without accidentally reaching into
user-identifiers, this patch renames it such that it is reserved. The
naming follows the style of the recently added `__clang_trap_msg`
debug-info symbol.
[SLP][NFC] Precommit test for zext reorder with duplicate shifts (llvm#183748)
This is a pre-commit test for llvm#183627.
[SystemZ] Add indirect reference bit XATTR REFERENCE(INDIRECT) for indirect symbol handling support (llvm#183441)
This is the first of three patches aimed to support indirect symbol
handling for the SystemZ backend. This PR introduces a `GOFF:ERAttr` to
represent indirect references, handles indirect symbols within
`setSymbolAttribute()` by setting the indirect reference bit, and also
updates the HLASM streamer to emit `XATTR REFERENCE(INDIRECT)` and
various other combinations.
[mlir][vector] Rename `ReduceMultiDimReductionRank` -> `FlattenMultiReduction` (NFC) (llvm#183721)
The updated name better captures what the pattern does and matches the
corresponding `populat*` hook,
`populateVectorMultiReductionFlatteningPatterns`, that only contains
this pattern.
[pdb] Fix libc++ strict-weak-ordering assertion failures from gsiRecordCmp (llvm#183749)
Builds using libc++ hardening were hitting asserts like
libc++ Hardening assertion
!__comp(*(__first + __a), *(__first + __b)) failed:
Your comparator is not a valid strict-weak ordering
printf-debugging revealed that symbols like "?ST@@3ja" were not
comparing equal with themselves. It turns out the comparison was done
with
return S1.compare_insensitive(S2.data());
and even when &S1 == &S2, S1 and S2.data() may not refer to identical
strings, since data() may not have a null terminator where the StringRef
locally ends.
This fixes the ordering, simplifies the code, and makes it a little
faster :)
Fixes: llvm#163755
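The failure mode is easiest to see with a comparator that respects both lengths. The sketch below uses `std::string_view` in place of `StringRef`. `compareInsensitive` and `lessInsensitive` are hypothetical names and this is not the code that landed. It shows a comparison that never reads past the end of either view, so an element always compares equal to itself.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Length-aware, case-insensitive three-way comparison. Only the bytes inside
// each view are read, so a missing null terminator cannot affect the result.
int compareInsensitive(std::string_view a, std::string_view b) {
  const std::size_t common = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < common; ++i) {
    const int ca = std::tolower(static_cast<unsigned char>(a[i]));
    const int cb = std::tolower(static_cast<unsigned char>(b[i]));
    if (ca != cb)
      return ca < cb ? -1 : 1;
  }
  if (a.size() == b.size())
    return 0;
  return a.size() < b.size() ? -1 : 1;
}

// Strict weak ordering built on top of it: irreflexive by construction.
bool lessInsensitive(std::string_view a, std::string_view b) {
  return compareInsensitive(a, b) < 0;
}
```

A comparator like `lessInsensitive` can be passed to `std::sort` safely because `lessInsensitive(x, x)` is always false, even when `x` is a slice of a longer buffer.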
[mlir] Enable specifying bytecode producer in mlir-opt. (llvm#182846)
[VPlan] Remove manual region removal when simplifying for VF and UF. (llvm#181252)
Replace manual region dissolution code in
simplifyBranchConditionForVFAndUF with using general
removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates
a (BranchOnCond true) or updates BranchOnTwoConds.
The loop then gets automatically removed by running removeBranchOnConst.
This removes a bunch of special logic to handle header phi replacements
and CFG updates. With the new code, there's no restriction on what kind
of header phi recipes the loop contains.
Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is
technically unrelated, but I could not find an independent test that
would be impacted.
The code to deal with epilogue resume values now needs updating, because
we may simplify a reduction directly to the start value.
PR: llvm#181252
[ThinLTO] Reduce the number of renaming due to promotions (llvm#178587)
Currently for thin-lto, the imported static global values (functions,
variables, etc) will be promoted/renamed from e.g., foo() to
foo.llvm.(). Such a renaming caused difficulties in live patching
since function name is changed ([1]).
It is possible that some global value names have to be promoted to avoid
name collision and linker failure. But in practice, majority of name
promotions can be avoided.
In [2], the suggestion is that thin-lto pre-link decides whether
a particular global value needs name promotion or not. If yes, later on
in thinBackend() the name will be promoted.
I compiled a particular linux kernel version (latest bpf-next tree)
and found 1216 global values with suffix .llvm.. With this patch,
the number of promoted functions is 2, 98% reduction from the
original kernel build.
If some native objects are not participating with LTO, name promotions
have to be done to avoid potential linker issues. So the current
implementation cannot be on by default. But in certain cases, e.g., linux kernel
build, people can enable lld flag --lto-whole-program-visibility to reduce the
number of functions like foo.llvm.().
For ThinLTOCodeGenerator.cpp which is used by llvm-lto tool and a
few other rare cases, reducing the number of renaming due to promotion,
is not implemented as lld flag '-lto-whole-program-visibility' is not supported
in ThinLTOCodeGenerator.cpp for now. In summary, this pull request
only supports llvm-lto2 style workflow.
[1] https://lpc.events/event/19/contributions/2212
[2] https://discourse.llvm.org/t/rfc-avoid-functions-like-foo-llvm-for-kernel-live-patch/89400
Revert "[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions" (llvm#183774)
Reverts llvm#183567
UBSan failure.
[lldb-dap] Improve test performance for 'cancel' request. (llvm#183632)
Update the test to more cleanly handle making a 'blocking' call using a
custom command instead of Python `time.sleep`, which we cannot easily
interrupt.
This should improve the overall performance of the tests, locally they
took around 30s and now finish in around 6s.
[clang-scan-deps] Add test for symlink-aliased module map PCM reuse across incremental scans (llvm#183328)
Add a test that verifies symlink aliases to a module map directory
produce the same PCM across incremental scans.
[SPIR-V] Fix non-deterministic compiler output for debug type pointer (llvm#182773)
Fixes: llvm#123791
[clang][modules] Prevent deadlock in module cache (llvm#182722)
When there's a dependency cycle between modules, the dependency scanner
may encounter a deadlock. This was caused by not respecting the lock
timeout. But even with the timeout implemented, leaving
`unsafeMaybeUnlock()` unimplemented means trying to take a lock after a
timeout would still fail and prevent making progress. This PR implements
this API in a way to avoid UB on `std::mutex` (when it's unlocked by
someone other than the owner). Lastly, this PR makes sure that
`unsafeUnlock()` ends the wait of existing threads, so that they don't
need to hit the full timeout amount.
This PR also implements `-fimplicit-modules-lock-timeout=<seconds>` that
allows tweaking the default 90-second lock timeout, and adds
`#pragma clang __debug sleep` that makes it easier to achieve desired execution
ordering.
rdar://170738600
[flang][NFC] Converted five tests from old lowering to new lowering (part 22) (llvm#183681)
Tests converted from test/Lower: intentout-deallocate.f90
Tests converted from test/Lower/Intrinsics: abs.f90, achar.f90,
acospi.f90, adjustl.f90
[libsycl] Add sycl::context stub (llvm#182826)
Part 1 of changes needed for USM alloc/dealloc impl.
This is part of the SYCL support upstreaming effort. The relevant RFCs
can be found here:
https://discourse.llvm.org/t/rfc-add-full-support-for-the-sycl-programming-model/74080
https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479
Signed-off-by: Tikhomirova, Kseniya kseniya.tikhomirova@intel.com
[flang] Implement -grecord-command-line for Flang (llvm#181686)
Enable Flang to match Clang behavior for command-line recording in DWARF
producer strings when using -grecord-command-line.
Signed-off-by: Yangyu Chen cyy@cyyself.name
[ASan] Enable Internalization for 'asanrtl.bc' in Driver (llvm#182825)
Just like other bitcode libs such as ockl.bc ocml.bc, link asanrtl.bc
with '-mlink-builtin-bitcode' in the driver when GPU ASan is enabled.
[mlir][vector] Fix crashes in MaskOp::fold and CanonializeEmptyMaskOp (llvm#183781)
Two related crashes were fixed in vector.mask handling:
MaskOp::fold() crashes with a null pointer dereference when the mask
is all-true and the mask body has no maskable operation (only a
vector.yield). getMaskableOp() returns nullptr in this case, and the
fold was calling nullptr->dropAllUses(). Fixed by returning failure()
when there is no maskable op, deferring to the canonicalizer.
CanonializeEmptyMaskOp creates an invalid arith.select when the mask
type is a vector (e.g., vector<1xi1>) but the result type is a scalar
(e.g., i32). arith.select with a vector condition requires the value
types to be vectors of the same shape. Fixed by bailing out when any
result type doesn't match the mask shape.
Regression tests are added for both cases.
Fixes llvm#177833
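The first of the two fixes is a null check before use. A minimal sketch of that shape, with `MaskBody`, `FoldStatus`, and `foldAllTrueMask` as hypothetical stand-ins rather than the MLIR API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the body of a vector.mask region.
struct MaskBody {
  std::vector<std::string> ops; // the last entry is always the terminator

  // Returns the masked operation, or nullptr when the body only yields.
  const std::string *getMaskableOp() const {
    return ops.size() < 2 ? nullptr : &ops.front();
  }
};

enum class FoldStatus { Failure, Folded };

// Folding an all-true mask hoists the masked op out of the region. When the
// region has no masked op there is nothing to hoist, so the fold reports
// failure and leaves the op for a later canonicalization pattern.
FoldStatus foldAllTrueMask(const MaskBody &body, bool maskIsAllTrue) {
  if (!maskIsAllTrue)
    return FoldStatus::Failure;
  const std::string *maskable = body.getMaskableOp();
  if (!maskable)
    return FoldStatus::Failure; // previously this pointer was used unchecked
  return FoldStatus::Folded;
}
```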
[lldb] Add arithmetic binary addition to DIL (llvm#177208)
[RISCV] Use getCopyFromReg in unit test to match comment. NFC (llvm#183199)
Using physical register 0, aka NoRegister, also just looked suspicious.
Revert "[ThinLTO] Reduce the number of renaming due to promotions (llvm#178587)" (llvm#183782)
There is a conflict with existing code. See
llvm#178587
Revert and resolve the conflict and then will submit later.
[MTE] [HWASan] support more complicated lifetimes
This allows us to support more lifetimes, and also gets rid of
the quadratic call to isPotentiallyReachable.
Reviewers: pcc, usama54321
Reviewed By: pcc
Pull Request: llvm#182425
[bazel] Enable `parse_headers` for llvm/BUILD.bazel (llvm#183680)
Instead of excluding the whole package, push any existing parse_headers
failures to individual targets. In some cases we can avoid suppressing a
target by adding a few missing deps.
[SystemZ] Emit external aliases required for indirect symbol handling support (llvm#183442)
This is the second of three patches aimed to support indirect symbol
handling for the SystemZ backend. An external name is added for both MC
sections and symbols and makes the relevant printers and writers utilize
the external name when present. Furthermore, the ALIAS HLASM instruction
is emitted after every XATTR instruction.
Depends on llvm#183441.
[LoopInfo] Preserve profile information in makeLoopInvariant (llvm#174171)
When hoisting loop invariant instructions, we can preserve profile
metadata because it depends solely on the condition (which is loop
invariant) rather than where we are in the control flow graph.
[NFC][POWER] add Pre-Commit test case for Inefficient std::bit_floor(x) (llvm#183363)
Add a pre-commit test case for the inefficient assembly generated for
std::bit_floor(x) on PowerPC.
[flang-rt] Enable more runtime functions for the GPU target (llvm#183649)
Summary:
This primarily enables `stop.cpp` and `descriptor.cpp`. Requires a
little bit of wrangling to get it to compile. Unlike the CUDA build,
this build uses an in-tree libc++ configured for the GPU. This is
configured without thread support, environment, or filesystem, and it is
not POSIX at all. So, no mutexes, pthreads, or get/setenv.
I tested stop, but I don't know if it's actually legal to exit from
OpenMP offloading.
Reapply "[ValueTracking] Propagate sign information out of loop" (llvm#182512)
LLVM converts a sqrt libcall to an intrinsic call if the argument is
known to be in range (greater than or equal to 0.0). In this case the
compiler is not able to deduce the non-negativity on its own. Extended
ValueTracking to understand such loops.
Created new APIs for matching intrinsics with three operands
(previously these existed only for two operands):
`matchSimpleTernaryIntrinsicRecurrence` and `matchThreeInputRecurrence`.
Fixes llvm#174813
[flang] [flang-rt] Addition of the Fortran 2023 TOKENIZE intrinsic. (llvm#181030)
This implements the TOKENIZE intrinsic per the Fortran 2023 Standard.
TOKENIZE is a more complicated addition to the flang intrinsics, as it
is the first subroutine that has multiple unique footprints. Intrinsic
functions have already addressed this challenge, however subroutines and
functions are processed slightly differently and the function code was
not a good 1:1 solution for the subroutines. To solve this the function
code was used as an example to create error buffering within the
intrinsics Process and select the most appropriate error message for a
given subroutine footprint.
A simple FIR compile test was added to show the proper compilation of
each case. A thorough negative path test has also been added, ensuring
that all possible errors are reported as expected.
Testing prior to commit:
= check-flang ==========================================
= check-flang-rt ==========================================
= llvm-test-suite ==========================================
Additionally, (FYI) an executable test has been written and will be
added to the llvm-test-suite under a separate PR.
Co-authored-by: Kevin Wyatt kwyatt@hpe.com
[lldb-dap] Adjust VariableReferenceStorage lifetime management. (llvm#183176)
Adjusting `VariableReferenceStorage` to only need to track permanent vs
temporary storage by making `VariableStore` the common base class.
Moved the subclasses of `VariableStore` into the Variables.cpp file,
since they're no longer referenced externally.
Expanding on the tests by adding an updated core dump with variables in
the argument scope we can use to validate variable storage.
[mlir][LLVM] Let decomposeValue/composeValue handle aggregates (llvm#183405)
This commit updates the LLVM::decomposeValue and LLVM::composeValue
methods to handle aggregate types - LLVM arrays and structs, and to have
different behaviors on dealing with types like pointers that can't be
bitcast to fixed-size integers. This allows the "any type" on
gpu.subgroup_broadcast to be more comprehensive - you can broadcast a
memref to a subgroup by decomposing it, for example.
(This branched off of getting an LLM to implement
ValueBoundsOpInterface on subgroup_broadcast, having it add handling
for the dimensions of shaped types, and realizing that there's no
fundamental reason you can't broadcast a memref or the like)
Co-authored-by: Claude Opus 4.6 noreply@anthropic.com
[WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow (llvm#181755)
Rather than mapping out full "reachability" between blocks in a region
to find loops and using `LoopBlocks` to find the bodies of said loops,
use SCCs (strongly-connected components) to provide this information.
This brings in LLVM's generic `SCCIterator` (which uses Tarjan's
algorithm) as the implementation for sorting the basic blocks of the CFG
into their SCCs.
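As a rough illustration of why SCCs are enough to find loops, here is a standalone sketch of Tarjan's algorithm over a toy CFG. It is not LLVM's `SCCIterator` and not the code in this pass; `Graph`, `TarjanSCC`, and `hasLoop` are hypothetical names, and blocks are modelled as plain integer indices.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy CFG (assumption): Succs[B] lists the successor block indices of block B.
using Graph = std::vector<std::vector<int>>;

// Hypothetical helper, not LLVM's SCCIterator: recursive Tarjan SCC search.
struct TarjanSCC {
  const Graph &Succs;
  std::vector<int> Index, LowLink, Stack;
  std::vector<bool> OnStack;
  std::vector<std::vector<int>> SCCs; // filled in reverse topological order
  int NextIndex = 0;

  explicit TarjanSCC(const Graph &G)
      : Succs(G), Index(G.size(), -1), LowLink(G.size(), 0),
        OnStack(G.size(), false) {
    for (int B = 0; B < static_cast<int>(G.size()); ++B)
      if (Index[B] == -1)
        visit(B);
  }

  void visit(int B) {
    Index[B] = LowLink[B] = NextIndex++;
    Stack.push_back(B);
    OnStack[B] = true;
    for (int S : Succs[B]) {
      if (Index[S] == -1) {
        visit(S);
        LowLink[B] = std::min(LowLink[B], LowLink[S]);
      } else if (OnStack[S]) {
        LowLink[B] = std::min(LowLink[B], Index[S]);
      }
    }
    if (LowLink[B] != Index[B])
      return;
    // B is the root of an SCC: pop the whole component off the stack.
    std::vector<int> Component;
    int Member;
    do {
      Member = Stack.back();
      Stack.pop_back();
      OnStack[Member] = false;
      Component.push_back(Member);
    } while (Member != B);
    SCCs.push_back(Component);
  }
};

// A multi-block SCC, or a single block with a self-edge, contains a loop.
bool hasLoop(const Graph &G, const std::vector<int> &Component) {
  if (Component.size() > 1)
    return true;
  int B = Component.front();
  return std::find(G[B].begin(), G[B].end(), B) != G[B].end();
}
```

Each block and edge is visited once, so the cost is linear in the size of the CFG, in contrast to computing reachability between every pair of blocks.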
This PR greatly reduces the compile-time footprint of the pass, making
memory use and time taken negligible where it could previously
cause stalls and OOM (e.g. llvm#47793,
usagi-coffee/tree-sitter-abl#114)
Supersedes llvm#179722
Fixes llvm#47793
Fixes llvm#165041 (probably)
Thanks to @jkbz64 for the initial investigations (w/ AI; see llvm#179722)
into why this pass was slow and memory consuming and showing that SCCs
were the key.
Also thanks to the Cheerp compiler project for bringing `SCCIterator` to
light in this context (blog post, implementation).
[OpenMP] Enable internalization of 'ockl.bc' for OpenMP (llvm#183685)
Fix linking of 'ockl.bc' for OpenMP by switching from
`-mlink-bitcode-file` to `-mlink-builtin-bitcode`.
[SlotIndexes] Further pack indices to improve spill placement time (llvm#182640)
This patch makes it so that renumbering indices when inserting
instructions into the SlotIndexes analysis renumbers the entire list if
the list is otherwise densely packed. This fixes a case we saw on
AArch64 with a lot of spills where every single spill instruction
insertion required a renumbering of most of the instructions in a large
function, making the operation approximately quadratic.
This is not NFC as heuristics depend on the SlotIndex numbers, although
this should mostly be a wash as LRs should be extended ~equally.
[clang][ssaf] Add `JSONFormat` support for `TUSummaryEncoding`
This PR adds `JSONFormat` support for reading and writing
`TUSummaryEncoding`. The implementation exploits similarities in the
structures of `TUSummary` and `TUSummaryEncoding` by reusing existing
`JSONFormat` support for `TUSummary`. Duplication of tests has been
avoided by parameterizing the test fixture that runs all relevant
read/write tests against `TUSummary`, for `TUSummaryEncoding`. This
ensures that the two serialization paths remain in lockstep.
[SLP] Reject duplicate shift amounts in matchesShlZExt reorder path (llvm#183627)
In the reordered RHS path of matchesShlZExt, the code never checked that
each shift amount (0, Stride, 2×Stride, …) appears at most once. When
the same shift appeared in multiple lanes, it still filled Order,
producing a non-permutation (e.g. Order = [0,0,0,1]). That led to bad
shuffle masks and miscompilation (e.g. shuffles with poison).
The patch adds an explicit duplicate check: before setting Order[Idx] =
Pos, it ensures Pos has not been seen before, using a SmallBitVector
SeenPositions(VF). If a position is seen twice, the function returns
false and the optimization is not applied.
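The duplicate check described above can be sketched in isolation as follows. This is a minimal standalone model, not the SLP vectorizer code: `buildOrder` is a hypothetical helper, and `std::vector<bool>` stands in for `SmallBitVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: maps each lane's shift amount to a position
// (Shift / Stride) and records it in Order. Returns std::nullopt when the
// shifts do not describe a permutation of 0..VF-1.
std::optional<std::vector<std::size_t>>
buildOrder(const std::vector<unsigned> &Shifts, unsigned Stride) {
  const std::size_t VF = Shifts.size();
  if (Stride == 0)
    return std::nullopt;
  std::vector<std::size_t> Order(VF, 0);
  std::vector<bool> SeenPositions(VF, false); // stand-in for SmallBitVector
  for (std::size_t Idx = 0; Idx < VF; ++Idx) {
    if (Shifts[Idx] % Stride != 0)
      return std::nullopt; // not a multiple of the stride
    std::size_t Pos = Shifts[Idx] / Stride;
    // Reject out-of-range positions and positions already claimed by an
    // earlier lane; without this, Order could become e.g. [0,0,0,1].
    if (Pos >= VF || SeenPositions[Pos])
      return std::nullopt;
    SeenPositions[Pos] = true;
    Order[Idx] = Pos;
  }
  return Order;
}
```

With a stride of 8, the shifts {8, 0, 24, 16} yield the order [1, 0, 3, 2], while {0, 0, 0, 8} is rejected because position 0 is claimed more than once.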
[SystemZ] Emit external aliases for indirect function descriptors in the ADA section (llvm#183443)
This is the last of the three patches aimed to support indirect symbol
handling for the SystemZ backend.
An external alias is emitted for indirect function descriptors within
the ADA section, rather than a temporary alias, while also setting all
of the appropriate symbol attributes that are needed for the HLASM
streamer to emit the correct XATTR and ALIAS instructions for the
indirect symbols.
Moreover, this patch updates the
`CodeGen/SystemZ/zos-ada-relocations.ll` test as the ADA section is
currently the only user of indirect symbols on z/OS.
Depends on llvm#183442.
[InstCombine] Replace alloca with undef size with poison instead of null (llvm#182919)
InstCombine previously replaced an alloca instruction with a null
pointer when the array size operand was undef. While this replacement
may be legal, it still caused invalid IR in cases where the original
alloca was used by `@llvm.lifetime` intrinsics.
The spec requires that the pointer operand of `@llvm.lifetime.*` must be
either:
Replacing the pointer with null violated this requirement and triggered
verifier errors.
These new changes update InstCombine so that in this scenario the alloca
is replaced with poison instead of null.
[Clang][ItaniumMangle] Fix recursive mangling for lambda init-captures (llvm#182667)
Mangle computation for lambda signatures can recurse when a call
operator type references an init-capture (for example via
decltype(init-capture)). In these cases, mangling can re-enter the
init-capture declaration and cycle back through operator() mangling.
Make lambda context publication explicit and independent from numbering
state, then use that context uniformly during mangling:
`ContextDecl` in `Sema::handleLambdaNumbering()` before
numbering, so dependent type mangling can resolve the lambda context
without recursing through the call operator.
`CXXRecordDecl::setLambdaContextDecl()` and remove `ContextDecl` from
`CXXRecordDecl::LambdaNumbering`.
`ASTImporter` to import/set lambda context separately from numbering.
computation:
`getEffectiveDeclContext()`
`getClosurePrefix()`
`mangleLocalName()`
generic and rely on the computed context
Add mangling regression coverage in mangle-lambdas.cpp, including:
local init-captures used through decltype
non-local variable-template init-captures in decltype
non-local static inline member init-captures in decltype
Fixes Clang frontend C++ crash on decltype in lambda params llvm/llvm-project#63271
Fixes Stack overflow in clang-query 17, ItaniumMangle.cpp llvm/llvm-project#86240
Fixes ICE in Clang 21 with recursive lambda capture — infinite name mangling loop since clang-17 llvm/llvm-project#139089
[CIR] Implement TryOp flattening (llvm#183591)
This updates the FlattenCFG pass to add flattening for cir::TryOp in
cases where the TryOp contains catch or unwind handlers.
Substantial amounts of this PR were created using agentic AI tools, but
I have carefully reviewed the code, comments, and tests and made changes
as needed. I've left intermediate commits in the initial PR if you'd
like to see the progression.
[clang-format] Fix SpaceBeforeParens with explicit template instantiations (llvm#183183)
This fixes explicit template instantiated functions not having spaces
added/removed based on the value of `SpaceBeforeParens`.
Attribution Note - I have been authorized to contribute this change on
behalf of my company: ArenaNet LLC
[Driver][SYCL] Add tests for -Xarch_ option forwarding to SYCL JIT compilation. (llvm#178025)
This change adds test coverage to verify that options passed via
`-Xarch_<arch> <option>` are correctly forwarded to SYCL JIT
compilations.
[flang] Fix explanatory messages for generic resolution error (llvm#183565)
The compiler emits messages to explain why each of a generic procedure's
specific procedures is not a match for a given set of actual arguments.
In the case of specific procedures with PASS arguments in derived type
procedure bindings or procedure components, these explanatory messages
are often bogus, because the re-analysis didn't adjust the actual
arguments to account for the PASS argument. Fix.
[clang] stop error recovery in SFINAE for narrowing in converted constant expressions (llvm#183614)
A narrowing conversion in a converted constant expression should produce
an invalid expression so that [temp.deduct.general]p7 is satisfied, by
stopping substitution at this point.
This regression was introduced in llvm#164703, and this will be backported
to clang-22, so no release notes.
Fixes llvm#167709
[libc][math] Refactor bf16sub family to header-only (llvm#182115)
Refactors the bf16sub math family to be header-only.
Closes llvm#182114
Target Functions:
[alpha.webkit.NoDeleteChecker] Check if each field is trivially destructive (llvm#183711)
This PR fixes the bug that NoDeleteChecker and trivial function analysis
were not detecting any non-trivial destruction of class member
variables.
When evaluating a delete expression or calling a destructor directly for
triviality, check if each field in the class and its base classes is
trivially destructible.
[AMDGPU] Handle GFX1250 hazards between WMMA and VOPD (llvm#183573)
Hazards between WMMA and VALU were handled in llvm#149865 but this only
worked for regular VOP* VALU encodings, not for VOPD.
Fixes: llvm#183546
[ASan] Document limitations of container overflow checks (llvm#183590)
Mention that partially poisoning stack objects can
lead to false positives and negatives.
See llvm#182720.
Co-authored-by: Saleem Abdulrasool compnerd@compnerd.org
Revert "[Metal][HLSL] Add support for dumping reflection" (llvm#183818)
Reverts llvm#181258
`env PATH=""` will prevent finding any binary run by `env`.
[clang] fix crash when casting a parenthesized unresolved template-id (llvm#183633)
This fix uses IgnoreParens() in CheckPlaceholderExpr to prevent a crash
when an unresolved template-id is wrapped in parentheses. Fixes llvm#183505
[clang][modulemap] Lazily load module maps by header name (llvm#181916)
After header search has found a header it looks for module maps that
cover that header. This patch uses the parsed representation of module
maps to do this search instead of relying on FileEntryRef lookups after
stating headers in module maps.
This behavior is currently gated behind the
`-fmodules-lazy-load-module-maps` `-cc1` flag.
[mlir][cf] Fix crash in simplifyBrToBlockWithSinglePred when branch operand is a block argument of its successor (llvm#183797)
When `simplifyBrToBlockWithSinglePred` merges a block into its sole
predecessor, it calls `inlineBlockBefore` which replaces each block
argument with the corresponding value passed by the branch. If one of
those values is itself a block argument of the successor block, the call
`replaceAllUsesWith(arg, arg)` is a no-op. Any uses of that argument
outside the block (e.g. in a downstream block) are therefore not
replaced, and when the successor block is erased the argument is
destroyed while those uses are still live, triggering the assertion
`use_empty() && "Cannot destroy a value that still has uses!"` in
`IRObjectWithUseList::~IRObjectWithUseList`.
Guard against this by returning early when any branch operand is a block
argument owned by the destination block.
Fixes llvm#126213
[BOLT][AArch64] Add a unittest for compare-and-branch inversion. (llvm#181177)
Checks that isReversibleBranch() returns false
Checks that reverseBranchCondition() adjusts
[WebAssembly] Fix -Wunused-variable in llvm#181755
This variable ends up being unused in builds without assertions. Mark it
[[maybe_unused]] per the coding standards.
[NFC] [HWASan] add test for duplicated lifetime end
Reviewers:
Pull Request: llvm#183807
[NFC] [MTE] add test for duplicated lifetime end
Reviewers:
Pull Request: llvm#183808
Revert "[VPlan] Remove manual region removal when simplifying for VF and UF. (llvm#181252)"
This reverts commit 9c53215.
Appears to cause crashes with ordered reductions, revert while I
investigate
[mlir][LLVM] Let decomposeValue/composeValue pad out larger types (llvm#183825)
As pointed out in the reviews for llvm#183405, decomposeValue and
composeValue should be able to emit zexts and truncations for cases
like i48 and vector<3xi16> becoming i32s, but currently that hits an
assert. This commit fixes that limitation.
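To make the padding concrete, here is a small standalone sketch of the arithmetic involved when a 48-bit value is carried as two i32 chunks. It only models the bit manipulation on plain integers; `decompose` and `compose` are hypothetical helpers, not the `LLVM::decomposeValue`/`LLVM::composeValue` API, and the low-chunk-first ordering is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: zero-extend a BitWidth-bit value to the next multiple
// of 32 bits and split it into 32-bit chunks, low chunk first (assumed order).
std::vector<uint32_t> decompose(uint64_t Value, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 64 && "sketch only handles 1..64 bits");
  // Clear anything above BitWidth so the padding bits are zero (the "zext").
  if (BitWidth < 64)
    Value &= (uint64_t{1} << BitWidth) - 1;
  unsigned NumChunks = (BitWidth + 31) / 32;
  std::vector<uint32_t> Chunks;
  for (unsigned I = 0; I < NumChunks; ++I)
    Chunks.push_back(static_cast<uint32_t>(Value >> (32 * I)));
  return Chunks;
}

// Inverse: reassemble the chunks and truncate back to BitWidth bits.
uint64_t compose(const std::vector<uint32_t> &Chunks, unsigned BitWidth) {
  uint64_t Value = 0;
  for (unsigned I = 0; I < Chunks.size() && I < 2; ++I)
    Value |= static_cast<uint64_t>(Chunks[I]) << (32 * I);
  if (BitWidth < 64)
    Value &= (uint64_t{1} << BitWidth) - 1; // the "trunc"
  return Value;
}
```

For example, the 48-bit value 0xABCD12345678 splits into the chunks 0x12345678 and 0x0000ABCD, and composing those chunks with a width of 48 gives the original value back.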
Co-authored-by: Claude Opus 4.6 noreply@anthropic.com
[Offload] Remove unused data type (llvm#183840)
[VPlan] Support unrolling/cloning masked VPInstructions.
Account for masked VPInstruction when verifying the operands in the
constructor. Fixes a crash when trying to unroll VPlans for predicated
early exits.
[mlir][arith] Add `exact` to `index_cast{,ui}` (llvm#183395)
The `exact` flag with the following semantics
can be added to index_cast and index_castui operations. This unlocks
the following lowerings:
Changes:
`ArithExactFlagInterface`
[lldb] Add synthetic support to formatter_bytecode.py (llvm#183804)
Updates formatter_bytecode.py to support compilation and disassembly for
synthetic formatters, in other words support for multiple functions
(signatures).
This includes a number of other changes:
This work is a prelude to the ongoing work on a Python to formatter
bytecode compiler. The Python compiler will emit assembly, and this module
(formatter_bytecode) will compile it into binary bytecode.
[lldb] Add skip shared build to more API tests
Fixing test failures on my local desktop with incremental
building.
[libc++] Forward find* algorithms to find_if (llvm#179938)
This propagates any optimizations to the whole family of `find`
functions.
Fix `BuiltinTypeMethodBuilder` uninitialized pointer (llvm#183814)
From this comment on PR llvm#176058, static analysis was flagging
`TemplateParams` as not initialized on all paths. This change fixes it by
initializing to `nullptr` at declaration.
[NFC] Fix use-after-free: track TargetLibraryAnalysis in BasicAAResult invalidation (llvm#183852)
`BasicAAResult` holds a reference to `TargetLibraryInfo` but its
`invalidate()` function did not check `TargetLibraryAnalysis`. When the
pass manager destroyed and re-created `TLI` (e.g. during `CGSCC`
invalidation or `FAM.clear()`), `BasicAAResult` survived with a dangling
`TLI` reference.
This was exposed by llvm#157495 which added `aliasErrno()`, the first code
path that dereferences `TLI` from `BasicAAResult` during the `CGSCC`
pipeline, causing an AV when compiling Rust's core library on Arm64
Windows.
This change adds `TargetLibraryAnalysis` to the invalidation check so
`BasicAAResult` is properly invalidated when its `TLI` reference becomes
stale.
[lldb] Change the way the shlib directory helper is set (llvm#183637)
This PR changes the way we set the shlib directory helper. Instead of
setting it while initializing the Host plugin, we register it when
initializing the Python plugin. The motivation is that the current
approach is incompatible with the dynamically linked script
interpreters, as they will not have been loaded at the time the Host
plugin is initialized.
The downside of the new approach is that we set the helper after having
initialized the Host plugin, which theoretically introduces a small
window where someone could query the helper before it has been set.
Fortunately the window is pretty small and limited to when we're
initializing plugins, but it's less "pure" than what we had previously.
That said, I think it balances out with removing the plugin include.
[cmake] Disable -Wdangling-pointer on GCC 12+ (llvm#183593)
GCC 12 started warning on the RAII DAGUpdateListener pattern in
SelectionDAG.h (storing `this` in the constructor). It's a false
positive -- suppress it the same way we handle -Wno-dangling-reference
(GCC 13+) and -Wno-stringop-overread (GCC 11+).
[CIR] Fix dominance problems with values defined in cleanup scopes (llvm#183810)
We currently encounter dominance verification errors when a value is
defined inside a cleanup scope but used outside the scope. This occurs
when forceCleanup() is used to exit a cleanup scope while a variable is
holding a value that was created in the scope body. Classic codegen
solved this problem by passing a list of values to spill and reload to
forceCleanup(). This change implements that same solution for CIR.
I have also aligned the ScalarExprEmitter::VisitExprWithCleanups
implementation with that of classic codegen, eliminating an extra
lexical scope. This causes temporary allocas to be created at the next
higher existing lexical scope, but I think that's OK since they would be
hoisted there anyway by a later pass.
[mlir][GPU] Add ValueBoundsOpInterface to gpu.subgroup_broadcast (llvm#183848)
This commit adds a ValueBoundsOpInterface to gpu.subgroup_broadcast,
matching its integer range interface implementation, so that affine
analysis can peek through subgroup broadcast ops.
[Hexagon] Define HVX_IEEE_FP when -mhvx-ieee-fp is enabled (llvm#183829)
Add an HVX_IEEE_FP define when the compiler is invoked with the
-mhvx-ieee-fp flag.
[CMake] Propagate dependencies to OBJECT libraries in `add_llvm_library` (llvm#183541)
Previously, transitively inherited calls to
`target_include_directories(foo SYSTEM ...)` were being squashed into a
flat list of includes, effectively stripping off `-isystem` and
unintentionally forwarding warnings from such dependencies.
To correctly propagate `SYSTEM` dependencies, use
`target_link_libraries` to forward the parent target's link dependencies
to the OBJECT library (similar to the `_static` flow below). Unlike a
flat `target_include_directories`, this lets CMake resolve transitive
SYSTEM include directories through the proper dependency chain.
Note that `target_link_libraries` on an OBJECT library propagates all
usage requirements, not just includes. This also brings in transitive
`INTERFACE_COMPILE_DEFINITIONS`, `INTERFACE_COMPILE_OPTIONS`, and
`INTERFACE_COMPILE_FEATURES`. This is arguably more correct, as the
OBJECT library compiles the same sources and should see the same flags.
The existing `target_include_directories` call is retained for include
directories set directly on the target (not through link dependencies).
CMake deduplicates include directories that appear through both paths.
Compile definitions and options may technically appear twice (once via
the OBJECT library, once via the consuming target), but duplicate `-D`
and flag entries should be harmless in practice.
Also fix `clang_target_link_libraries` and `mlir_target_link_libraries`
to forward the link type (PUBLIC/PRIVATE/INTERFACE) to `obj.*` targets.
Previously the type keyword was silently dropped, resulting in
plain-signature `target_link_libraries` calls. This is now required
because the new keyword-signature call in `llvm_add_library` would
otherwise conflict (CMake requires all calls on a target to use the same
signature).
[lldb] Fix sys.path manipulation failure in formatter_bytecode.py (llvm#183868)
Fix bug in llvm#183804.
[SLP]Fix operand reordering when estimating profitability of operands
Need to swap the operands of a single instruction, not those in the same
lane of the first and second instruction in the list.
[CIR][NFC] Move some builtin tests to the CodeGenBuiltins folder (llvm#183607)
This moves a few tests that were created in the wrong location. Also
changes the names of some test files to maintain consistency.
[AMDGPU][SIInsertWaitcnts] Move VCCZ workaround code out of the way (llvm#182619)
This is a cleanup patch that moves the VCCZ specific workaround code
from `SIInsertWaitcnts::insertWaitcntInBlock()` to a separate class and
refactors it a bit to make it easier to read.
The end result is a simpler `insertWaitcntInBlock()`.
Should be NFC.
Revert "[WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow (llvm#181755)" (llvm#183872)
This reverts commit c05e323.
Changes failed Emscripten tests.
[AMDGPU] Remove extra pipes from load-saddr-offset-imm.ll (llvm#183874)
This test uses opt to run instcombine and then pipes that into llc which
has its output piped into FileCheck. Before this patch, the test also
piped in the source file into llc as well, which caused issues with a
downstream test executor that executes the lines in bash. However, these
extra pipes don't make sense anyways, so remove them.
[libc][math] Refactor floor family to header-only (llvm#182194)
Refactors the floor math family to be header-only.
Closes llvm#182193
Target Functions:
[libc][math][c23] implement C23 `acospif` math function (llvm#183661)
Implementing C23 `acospi` math function for single-precision with the
header-only approach that is followed since llvm#147386
Clang: Deprecate float support from __builtin_elementwise_max (llvm#180885)
Now we have
__builtin_elementwise_maxnum
__builtin_elementwise_maximum
__builtin_elementwise_maximumnum
[VPlan] Don't adjust trip count for DataAndControlFlowWithoutRuntimeCheck (llvm#183729)
Previously, the canonical IV increment may have overflowed to a non-zero
value due to vscale being a non power-of-two. So we used to emit a
runtime check for this.
If you didn't want the runtime check,
DataAndControlFlowWithoutRuntimeCheck skipped it and instead tweaked the
trip count so it wouldn't overflow.
However llvm#144963 stopped the check from ever being emitted because vscale
is always a power-of-two on AArch64 and RISC-V, so it never overflowed
to a non-zero value. And in llvm#183292 the code to emit the check was
removed. But we never restored the trip count back to normal when the
target's vscale was a power-of-two.
Now that vscale is always a power-of-two, this PR avoids adjusting it. A
follow up NFC can then remove DataAndControlFlowWithoutRuntimeCheck.
[CIR] Infrastructure and MemorySpaceAttrInterface for Address Spaces (llvm#179073)
Related: llvm#175871,
llvm#179278,
llvm#160386
Introducing the LangAddressSpace enum with offload address space kinds
(offload_private, offload_local, offload_global, offload_constant,
offload_generic) and the LangAddressSpaceAttr attribute.
Generalizes CIR AS attributes as MemorySpaceAttrInterface and attaches
it to `PointerType`. Includes test coverage for valid IR roundtrips and
invalid address space parsing.
This starts a series of patches with the purpose of bringing complete
address space support to CIR. Most of the test coverage is
provided in subsequent patches further down the stack. Note that most of
these patches are based on: llvm/clangir#1986
[CIR] Use `-verify` on clang/test/CIR/CodeGenHLSL/matrix-element-expr-load.hlsl (llvm#182817)
Update clang/test/CIR/CodeGenHLSL/matrix-element-expr-load.hlsl to use
`-verify` with expected CIR NYI diagnostics.
[CodeGen] Allow `-enable-ext-tsp-block-placement` and `-apply-ext-tsp-for-size` passed together (llvm#183642)
Currently, an assert fires when both `UseExtTspForPerf` and
`UseExtTspForSize` are true on a given function.
Ideally, we should allow `-enable-ext-tsp-block-placement` and
`-apply-ext-tsp-for-size` passed together, meaning run the block
placement for performance on hot functions, while running the placement
for size on cold functions.
The diff makes `UseExtTspForPerf` and `UseExtTspForSize` mutually
exclusive per-function: functions with the `OptForSize` attribute use
ext-tsp block placement for size, while the others use ext-tsp block
placement for perf.
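The per-function selection can be pictured with the following standalone sketch. It is a simplified model rather than the MachineBlockPlacement code: `Flags`, `ExtTspMode`, and `selectMode` are hypothetical names, and the behaviour when only one of the two flags is passed is an assumption of the sketch, since the description above only covers the case where both are set.

```cpp
#include <cassert>

enum class ExtTspMode { None, ForPerf, ForSize };

// Hypothetical model of the two command-line flags; the field names are
// illustrative and do not come from the LLVM source.
struct Flags {
  bool EnableExtTspBlockPlacement; // -enable-ext-tsp-block-placement
  bool ApplyExtTspForSize;         // -apply-ext-tsp-for-size
};

// Picks at most one mode per function, so the two can never both be active.
ExtTspMode selectMode(const Flags &F, bool FunctionOptForSize) {
  // Size-optimized functions take the size layout when it is requested.
  if (F.ApplyExtTspForSize && FunctionOptForSize)
    return ExtTspMode::ForSize;
  // Every other function takes the performance layout when it is requested.
  if (F.EnableExtTspBlockPlacement && !FunctionOptForSize)
    return ExtTspMode::ForPerf;
  // Assumption: with only one flag set, non-matching functions get neither.
  return ExtTspMode::None;
}
```

Because the result is a single enum value, a function can no longer end up with both the perf and size variants enabled, which is the condition the old assert guarded against.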
Co-authored-by: Sharon Xu sharonxu@fb.com
[clang-tidy] Teach `misc-unused-using-decls` that exported using-decls aren't unused (llvm#183638)
Fixes llvm#162619.
AArch64: Replace @plt/%gotpcrel in data directives with %pltpcrel %gotpcrel (llvm#155776)
Similar to llvm#132569 for RISC-V, replace the unofficial `@plt` and
`@gotpcrel` relocation specifiers, currently only used by clang
`-fexperimental-relative-c++-abi-vtables`, with %pltpcrel %gotpcrel. The
syntax is not used in human-written assembly code, and is not supported
by GNU assembler.
Also replace the recent `@funcinit` with `%funcinit(x)`.
[AMDGPU] Fix piggybacking after commute in AMDGPULowerVGPREncoding (llvm#183778)
After successfully commuting an instruction to be compatible with the
current VGPR MSB mode, update CurrentMode with the commuted
instruction's mode requirements. This locks in the mode bits the
commuted instruction relies on, preventing later instructions from
piggybacking and corrupting those bits.
Without this fix, a subsequent instruction needing a different mode
could piggyback onto the preceding s_set_vgpr_msb and change mode bits
that the commuted instruction depends on. For example, a nullopt src1
position (treated as 0) could be overwritten to a different value,
causing incorrect register encoding for the commuted instruction.
The fix still allows compatible piggybacking - instructions that only
add new mode bits without changing existing ones can still piggyback.
[Driver] Add -Wa,--reloc-section-sym= to control section symbol conversion (llvm#183472)
Wire the llvm-mc --reloc-section-sym={all,internal,none} option through
the clang driver (-Wa,--reloc-section-sym=) and cc1as
(--reloc-section-sym=). The option is only valid for ELF targets.
GNU Assembler will add the option as well.
[lldb/test] Skip TestDelayInitDependency on remote platforms (llvm#183885)
This test exercises macOS-specific linker functionality (-delay_library)
and uses a hardcoded local working directory for the launch info. It
should not run against a remote platform where neither condition holds.
Signed-off-by: Med Ismail Bennani ismail@bennani.ma
InstCombine: Stop applying nofpclass from use nofpclass attribute (llvm#183835)
Functionally reverts a80d432, with new
test.
This should be applied somewhere, but this is the wrong place.
Fixes regression reported after llvm#182444
RISCVMCAsmInfo: Remove redundant `UseAtForSpecifier = false`. NFC (llvm#183890)
UseAtForSpecifier defaults to false in MCAsmInfo, and RISCVMCAsmInfo
never calls initializeAtSpecifiers (which sets it to true).
[LV] Add tail-folding & required scalar epilogue tests for IG narrowing.
Add additional tests to cover missing code paths when narrowing
interleave groups:
[llvm][DebugInfo] Bump DWARFContext maximum DWARF version (llvm#183838)
In order to start testing DWARFv6 feature support we need to bump this
version for tooling to work.
This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.
[llvm][DebugInfo] Bump DWARFDebugLine maximum DWARF version (llvm#183841)
Bumps `.debug_line` maximum supported version to DWARFv6.
This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.
[llvm][DebugInfo] Bump DWARFListTable maximum DWARF version (llvm#183859)
Bumps `.debug_rnglists` maximum supported version to DWARFv6.
This does not mean we officially support DWARFv6. It just enables us
to test the features gradually.
Added unit-test since there was no prior test in the entire LLVM
test-suite that checked this.
[ARM][MVE] Add SLI and SRI recognition. (llvm#183471)
This uses the newly added code from llvm#182051 to optimize to MVE sli and
sri. The only major difference is the legal types supported, but we also
lower intrinsics via VSLIIMM/VSRIIMM, so that only one tablegen pattern
is needed.
[CMake][LLVM] Disable PCH on Clang for file with custom flags too (llvm#183813)
Precompiled headers are already skipped when building ConstantFolding.cpp with MSVC. They cause problems with Clang too, so disable them there the same way.
[llvm-mc][dwarf] Bump supported version to DWARF 6 (llvm#183779)
Depends on:
Bumps the supported version to 6. Unit header layout hasn't changed
between versions AFAIK, so re-used the DWARF5 `FileCheck` in the test.
This by no means claims full DWARFv6 support, but is handy for testing
DWARFv6 features while full support is being gradually implemented.
[mlir][tensor] Fix crash in expand_shape fold with dynamic result type (llvm#183785)
`foldReshapeOp` (in `ReshapeOpsUtils.h`) and `FoldReshapeWithConstant`
(in `TensorOps.cpp`) both tried to create a new `DenseElementsAttr`
constant when folding a reshape op whose operand is a constant. Neither
checked that the result type was statically shaped before doing so, but
`DenseElementsAttr::reshape()` and
`DenseElementsAttr::getFromRawBuffer()` both assert `hasStaticShape()`.
Guard both fold paths with a `hasStaticShape()` check so they return
early when the result type contains a dynamic dimension.
Fixes llvm#177845
[mlir][test-ir-visitors] Fix noSkipBlockErasure crash with block args used across blocks (llvm#183828)
The noSkipBlockErasure callback in TestVisitors.cpp dropped uses of op
results within the same region before erasing a block, but did not drop
uses of the block's own arguments (e.g. function entry block arguments).
When the block was subsequently erased its block arguments were
destroyed while their use-lists were still non-empty, triggering the
assertion in IRObjectWithUseList::~IRObjectWithUseList().
Fix this by also iterating over the block's arguments and dropping any
uses that belong to the same parent region. This mirrors the existing
logic for op result uses and makes the block-erasure walk handle IRs
where function arguments are consumed by ops in sibling blocks.
Also replace `block->front().getParentRegion()` with `block->getParent()`
for robustness (avoids UB when the block has no ops).
Add a regression test based on the reproducer from llvm#182996.
Fixes llvm#182996
Restore llvm#125407, Make covmap tolerant of nested Decisions (llvm#183073)
Change(s):
[mlir] Fix crash in testNoSkipErasureCallbacks on empty blocks (llvm#183757)
The `noSkipBlockErasure` callback in `testNoSkipErasureCallbacks` called
`block->front().getParentRegion()` to get the parent region of a block.
This dereferences the ilist sentinel node when the block has no
operations, triggering an assertion failure.
Use `block->getParent()` instead, which directly returns the region
containing the block without requiring any operations to be present.
Fixes llvm#183511
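The difference between the two lookups can be illustrated with a toy data structure. This is only a sketch under assumptions: `Region`, `Operation`, and `Block` here are hypothetical stand-ins built on `std::list`, not the MLIR classes.

```cpp
#include <list>

// Toy stand-ins for the IR objects (assumptions of this sketch).
struct Region {};
struct Operation {
  Region *parentRegion;
};
struct Block {
  Region *parent;
  std::list<Operation> ops;
  // Reads the parent stored on the block itself; no operation is needed.
  Region *getParent() const { return parent; }
};

// Unsafe on an empty block: calling front() on an empty std::list is
// undefined behavior, which mirrors dereferencing the ilist sentinel.
//   Region *viaFront = block.ops.front().parentRegion;

// Safe regardless of how many operations the block holds.
inline Region *parentRegionOf(const Block &block) { return block.getParent(); }
```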
[mlir][affine] Fix crash in linearize_index fold when multi-index is ub.poison (llvm#183816)
`AffineLinearizeIndexOp::fold` guarded the constant-folding path with
`llvm::is_contained(adaptor.getMultiIndex(), nullptr)`, which only
catches operands that have not been evaluated at all. When an operand
folds to `ub.PoisonAttr`, the attribute is non-null so the guard passed,
and the subsequent `cast<IntegerAttr>(indexAttr)` call crashed with an
assertion failure.
Fix by replacing the null-only check with one that requires every
multi-index attribute to be a concrete `IntegerAttr`, returning `nullptr`
for any other attribute (including null and PoisonAttr).
Fixes llvm#178204
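The stricter check can be modeled with a `std::variant`. This is a simplified sketch, not the MLIR code: `FoldedOperand`, `Poison`, and `foldLinearizeIndex` are hypothetical names, `std::monostate` stands in for a null attribute, and the linearization assumes one basis entry per index in row-major order.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

// Stand-in for a poison attribute (an assumption of this sketch).
struct Poison {};

// monostate = operand not folded (null), int64_t = integer constant,
// Poison = operand folded to poison.
using FoldedOperand = std::variant<std::monostate, int64_t, Poison>;

// Folds a row-major linearization only if every operand is a concrete integer.
inline std::optional<int64_t>
foldLinearizeIndex(const std::vector<FoldedOperand> &multiIndex,
                   const std::vector<int64_t> &basis) {
  if (multiIndex.size() != basis.size())
    return std::nullopt;
  // A null-only check would let Poison through; require int64_t instead.
  for (const FoldedOperand &operand : multiIndex)
    if (!std::holds_alternative<int64_t>(operand))
      return std::nullopt;
  int64_t linear = 0;
  for (std::size_t i = 0; i < multiIndex.size(); ++i)
    linear = linear * basis[i] + std::get<int64_t>(multiIndex[i]);
  return linear;
}
```

Checking for the one accepted alternative, rather than excluding the known bad ones, keeps the fold safe if further non-integer results are ever added.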
[AArch64][PAC] Emit `!dbg` locations in `*_vfpthunk_` functions (llvm#179688)
The usage of pointers to member functions with Pointer Authentication
requires generation of `*_vfpthunk_` functions. These thunk functions
can be later inlined and optimized by replacing the indirect call
instruction with a direct one and then inlining that function call.
In the absence of `!dbg` metadata attached to the original call instruction,
such inlining ultimately results in an assertion "!dbg attachment points
at wrong subprogram for function" in the assertions-enabled builds. By
manually executing `opt` with the `-verify-each` option on the LLVM IR
produced by the frontend, an actual issue can be observed: "inlinable
function call in a function with debug info must have a !dbg location"
after the replacement of indirect call instruction with the direct one
takes place.
This commit fixes the issue by attaching artificial `!dbg` locations to
the original call instruction (as well as most other instructions in
the `*_vfpthunk_` function) the same way it is done for other
compiler-generated helper functions.
[AArch64] Add fcvt-i256 test cases. NFC
[LV] Remove duplicated IV expression sinking tests. (NFC)
Remove duplicated tests already covered by
llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-expr.ll.
[VPlan] Materialize UF after unrolling (NFCI).
Move materialization of the symbolic UF directly to unrollByUF. At this
point, unrolling materializes the decision and it is natural to also
materialize the symbolic UF here.
[mlir][IR] Generalize `DenseElementsAttr` to custom element types (llvm#183891)
`DenseElementsAttr` supports only a hard-coded list of element types: `int`, `index`, `float`, `complex`. This commit generalizes the `DenseElementsAttr` infrastructure: it now supports arbitrary element types, as long as they implement the new `DenseElementTypeInterface`.
The `DenseElementTypeInterface` has the following helper functions:
- `getDenseElementBitSize`: Query the size of an element in bits. (When storing an element in memory, each element is padded to a full byte. This is an existing limitation of the `DenseElementsAttr`, with an exception for `i1`.)
- `convertToAttribute`: Attribute factory / deserializer. Converts bytes into an MLIR attribute. The attribute provides the assembly format / printer for a single element.
- `convertFromAttribute`: Serializer. Converts an MLIR attribute into bytes.
Note: `convertToAttribute` / `convertFromAttribute` are mainly for
writing test cases. For performance reasons, `DenseElementsAttr` users
should work with raw bytes / elements and avoid any API that
materializes MLIR attributes. However, MLIR attributes typically have
human-readable parsers/printers, making them suitable for lit tests and
debugging.
This PR introduces an additional assembly format for `DenseElementsAttr`s. There are now two formats. (The existing one is kept for compatibility reasons.)
- `dense<[1, 2, 3]> : tensor<3xi32>`
- `dense<tensor<3xi32> : [1 : i32, 2 : i32, 3 : i32]>`
The new syntax is needed to disambiguate between "literal" (e.g., `1`)
and attribute (e.g., `1 : i32`) when parsing the first token. In the
literal-first syntax, we only parse literals. In the type-first syntax,
we only parse attributes.
The existing `int`, `index`, `float`, `complex` types also implement the `DenseElementTypeInterface`. This allows us to implement `DenseElementsAttr::get` and `AttributeElementIterator::operator*` in a generic way.
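To make the three helpers concrete, here is a standalone sketch of how a custom element type might plug into such an interface. It is not the MLIR API: the method names come from the description above, but every signature, the `DenseElementTypeModel` and `UInt12Model` classes, the use of `std::string` in place of an attribute, and the little-endian layout are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified analogue of the interface. A std::string plays
// the role of the per-element attribute (its printed form).
struct DenseElementTypeModel {
  virtual ~DenseElementTypeModel() = default;
  virtual std::size_t getDenseElementBitSize() const = 0;
  // "Deserializer": raw bytes -> printable element.
  virtual std::string
  convertToAttribute(const std::vector<uint8_t> &bytes) const = 0;
  // "Serializer": printable element -> raw bytes.
  virtual std::vector<uint8_t>
  convertFromAttribute(const std::string &text) const = 0;
};

// Bytes used per element when each element is padded to a full byte.
// (The description above notes i1 as an exception; it is not modeled here.)
inline std::size_t paddedElementBytes(std::size_t bitSize) {
  return (bitSize + 7) / 8;
}

// Toy 12-bit unsigned element type, stored little-endian in two bytes.
struct UInt12Model final : DenseElementTypeModel {
  std::size_t getDenseElementBitSize() const override { return 12; }

  std::string
  convertToAttribute(const std::vector<uint8_t> &bytes) const override {
    // Low byte first; only the low 4 bits of the second byte are significant.
    unsigned value = bytes.at(0) | ((bytes.at(1) & 0x0F) << 8);
    return std::to_string(value);
  }

  std::vector<uint8_t>
  convertFromAttribute(const std::string &text) const override {
    // Truncate to 12 bits, then split into two padded bytes.
    unsigned value = static_cast<unsigned>(std::stoul(text)) & 0x0FFF;
    return {static_cast<uint8_t>(value & 0xFF),
            static_cast<uint8_t>(value >> 8)};
  }
};
```

In this model, bulk consumers would read the padded bytes directly and use the two conversion functions only for printing and parsing, which matches the note above about avoiding attribute materialization on hot paths.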
RFC:
https://discourse.llvm.org/t/rfc-allow-custom-element-types-in-denseelementattr/89656
This is a re-upload of llvm#179122.
Precommit tests: strictfp rounding vector f16 intrinsics (llvm#183699)
[mlir][VectorToLLVM] Fix crash in VectorInsertOpConversion with dynamic index (llvm#183783)
VectorInsertOpConversion crashes with an assertion failure when
inserting a sub-vector at a dynamic position into a multi-dimensional
vector. The pattern calls getAsIntegers() on the position, which asserts
that all fold results are compile-time constant attributes.
The existing guard (checking llvm::IsaPred) only covered the
case where a scalar is inserted into the innermost dimension (the
extractvalue path). The guard was missing for the insertvalue path when
inserting a sub-vector at a dynamic position into a nested aggregate.
Fix: add the same guard before the llvm.insertvalue creation to return
failure() gracefully when any position index is dynamic, matching the
behavior of VectorExtractOpConversion.
Fixes llvm#177829
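The guard amounts to checking that every position is a compile-time constant before extracting integers from it. A minimal standalone sketch follows, with hypothetical names (`DynamicValue`, `Position`, `getStaticPositions`) and a `std::variant` standing in for the fold results that a position is made of.

```cpp
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

// Stand-in for a position that is only known at run time (an assumption).
struct DynamicValue {};

// A position is either a compile-time constant or a runtime value.
using Position = std::variant<int64_t, DynamicValue>;

// Returns the positions as integers, or std::nullopt if any is dynamic.
// A caller would treat std::nullopt as "pattern does not apply" and
// return failure instead of asserting.
inline std::optional<std::vector<int64_t>>
getStaticPositions(const std::vector<Position> &position) {
  std::vector<int64_t> result;
  result.reserve(position.size());
  for (const Position &p : position) {
    const int64_t *constant = std::get_if<int64_t>(&p);
    if (!constant)
      return std::nullopt; // Dynamic index: bail out gracefully.
    result.push_back(*constant);
  }
  return result;
}
```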
[CMake] Use keyword signature in two additional callsites (llvm#183889)
Fix-forward for llvm#183541.
Two callsites to target_link_libraries were not migrated to the
keyword signature.
Signed-off-by: Itay Bookstein <itay.bookstein@nextsilicon.com>
Revert "[mlir][IR] Generalize `DenseElementsAttr` to custom element types" (llvm#183917)
Reverts llvm#183891
Reverting a second time. The build bot failure seems to be
non-deterministic.
[X86] known-pow2.ll - add tests showing failure to handle ISD::EXTRACT_VECTOR_ELT nodes (llvm#183918)
[AMDGPU] Enable shift64 hazard recognition for gfx9 (llvm#183839)
Enable shift64 hazard recognition for gfx9 cores.
Signed-off-by: John Lu <John.Lu@amd.com>
[CIR] Implement ImplicitValueInitExpr for ComplexType (llvm#183836)
Implement ImplicitValueInitExpr for ComplexType
[AMDGPU] Assert non-array alloca does have a size (llvm#183834)
Refs
https://github.com/llvm/llvm-project/pull/179523/changes#r2851952141
[ARM] Lower strictfp vector fp16 rounding operations similar to default mode (llvm#183700)
Previously the strictfp rounding nodes were lowered by unrolling to
scalar operations, which hurts performance. This issue was partially
fixed in llvm#180480; this change continues that work and
implements optimized lowering for v4f16 and v8f16.
[lldb][Process/FreeBSDKernelCore] Add ppc64le support (llvm#180669)
This is LLDB versio