[pull] main from llvm:main #701

pull · 2025-11-14T23:51:04Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

This replaces the 2 bool flags and the anonymous union. This also removes an implicit conversion from Register to unsigned and a call to MCRegister::id(). The ArgDescriptor constructor was always assigning the union through the MCRegister field even for stack offsets. The change to SIMachineFunctionInfo.h fixes a case where getRegister was being called on an unset ArgDescriptor. Since it was only this case, it seemed cleaner to fix it at the caller. The other option would be to make getRegister() return MCRegister() for an unset ArgDescriptor.

…bol checks (#167806) Currently we're picking up and complaining about builtin (and procedure) symbols like null() when defaultmap(none) is set, so I've relaxed the restriction a bit to allow for procedures and named constants to bypass the restriction. It might be the case that we want to tighten it up again in certain aspects in the future.

The ds_gws_* instructions require gds as an operand. However, when nogds is given, it is treated the same as gds. This patch fixes this to disallow nogds.

This was getting joined in ShutDownExcecptionThread (sic) but not cleared. So this function was not safe to call twice, since you aren't supposed to join a thread twice. Sadly, this was called in MachTask::Clear and MachProcess::Destroy, which are both called when you tell debugserver to detach. This didn't seem to cause problems IRL, but the most recent ASAN detects this as an error and calls ASAN::Die, which was causing all the tests that ran detach to fail. I fixed that by moving the clear & test for m_exception_thread to ShutDownExceptionThread. I also fixed the spelling of that routine. And that routine was claiming to return a kern_return_t which no one was checking. It actually returns a kern_return_t if there was a Mach failure and a Posix error if there was a join failure. Since there's really nothing you can do but exit if this fails, which is always what you are in the process of doing when you call this, and since we have already done all the useful logging in ShutDownExceptionThread, I just removed the return value.
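The shape of the fix (test and clear the thread handle inside the same routine that joins it, and return nothing) can be sketched in a few lines. This is a Python illustration only, not the debugserver code; `ExceptionThreadOwner`, `shut_down`, and `join_count` are hypothetical names. Python happens to tolerate a second `join()`, so the sketch shows only the pattern, not the failure.

```python
import threading


class ExceptionThreadOwner:
    """Hypothetical stand-in for an object that owns one worker thread."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._stop.wait)
        self._thread.start()
        self.join_count = 0

    def shut_down(self):
        # Test and clear the handle in the routine that joins it, so a
        # second call finds nothing to join and returns quietly.
        thread, self._thread = self._thread, None
        if thread is None:
            return
        self._stop.set()
        thread.join()
        self.join_count += 1


owner = ExceptionThreadOwner()
owner.shut_down()  # e.g. reached from a Clear()-style path
owner.shut_down()  # e.g. reached from a Destroy()-style path; now a no-op
```

Because the handle is cleared before the join, callers do not need to coordinate about who shuts the thread down first.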

…67939) Non-binary output files from the compiler need the `OF_Text` flag set for encoding conversion to be performed correctly on z/OS. --------- Co-authored-by: Tony Tao <tonytao@ca.ibm.com>

Solaris doesn't define RLIMIT_NPROC, so this is expected to fail there. This fixes a test failure in llvm/utils/lit/tests/verbosity.py on Solaris due to this unexpected warning being included in the lit output.
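One way to tolerate a missing `RLIMIT_NPROC` is to probe for the constant before using it. The sketch below is an assumption about how such a guard could look, not the change made in lit; `try_set_nproc_limit` and the `solaris_like` stand-in are hypothetical.

```python
import types


def try_set_nproc_limit(resource_mod, limit):
    """Return True if the limit was applied, False if it was skipped."""
    rlimit_nproc = getattr(resource_mod, "RLIMIT_NPROC", None)
    if rlimit_nproc is None:
        # The platform has no RLIMIT_NPROC: skip without warning.
        return False
    resource_mod.setrlimit(rlimit_nproc, (limit, limit))
    return True


# Hypothetical stand-in for a `resource` module that lacks RLIMIT_NPROC.
solaris_like = types.SimpleNamespace(setrlimit=lambda which, limits: None)
applied = try_set_nproc_limit(solaris_like, 1024)
```

Passing the module in as a parameter keeps the sketch runnable on platforms where the standard `resource` module is unavailable.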

Effectively means we don't need to call into `llvmModule->convertFromNewDbgValues()` anymore. Added a flag to allow users to access the old behavior.

Part of #119709.

Addresses `TODO`s in file.cpp by replacing data copies via for loops with calls to inline_memcpy. Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>

With this option we can pass to BOLT names of functions to be printed through a file instead of specifying them all on command line.

_Exit(3) is a fairly simple syscall wrapper, whereas exit(3) calls atexit-registered functions plus a whole lot of machinery that requires support for sync primitives. Splitting the tests makes it easy to test the former on its own (especially for new port projects). --------- Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>
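The behavioral difference between the two calls can be observed from Python, whose `sys.exit` runs atexit handlers while `os._exit` terminates immediately. This is an analogy to the C functions rather than the libc tests themselves; `CHILD` and `run_child` are names made up for the sketch.

```python
import subprocess
import sys

# Child program: registers an exit handler, then terminates via {call}.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("handler ran"))
{call}
"""


def run_child(call):
    """Run the child with the given exit call; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD.format(call=call)],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


with_exit = run_child("sys.exit(3)")             # handler runs first
with_underscore_exit = run_child("os._exit(3)")  # handler is skipped
```

Both children return the same exit code; only the first prints the handler's message, which is why the two calls can be tested independently.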

By only defining it if LIBC_ASSERT macro is not defined. Fixes #162392

…68083) Reverts #165276 The newly added test failed on a number of buildbots.

RegisterId can represent a physical register, a MCRegUnit, or an index into a side structure that stores register masks. These 3 types were encoded by using the physical reg, stack slot, and virtual register encoding partitions from the Register class. This aliasing of the encoding scheme wasn't well contained, so Register::index2StackSlot and Register::stackSlotIndex appeared in multiple places. This patch gives RegisterRef its own encoding defines and separates it from Register. I've removed the generic idx() method in favor of getAsMCReg(), getAsMCRegUnit(), and getMaskIdx() for some degree of type safety. Some places used the RegisterId field of RegisterRef directly as a register. Those have been updated to use getAsMCReg. Some special cases for RegisterId 0 have been removed as it can be treated like a MCRegister by existing code. I think I want to rename the Reg field of RegisterRef to Id, but I'll do that in another patch. Additionally, callers of the RegisterRef constructor need to be audited for implicit conversions from Register/MCRegister to unsigned.

This is needed when building with `LLVM_LINK_LLVM_DYLIB` to build LLVM as a DLL on Windows. This effort is tracked in #109483.

Reverts #166355

This patch adds the target hooks required by Instruction Referencing for the AArch64 target, as described in https://llvm.org/docs/InstrRefDebugInfo.html#target-hooks. These hooks allow the instruction-referencing LiveDebugValues pass to track spill and restore instructions.

With this patch, we can use `llvm/utils/llvm-locstats/llvm-locstats.py` to see the coverage statistics on a clang.dSYM built in RelWithDebInfo mode.

Coverage with dbg_value:

```
 =================================================
            Debug Location Statistics
 =================================================
     cov%           samples         percentage(~)
 -------------------------------------------------
   0%               5828021               38%
   (0%,10%)          127739                0%
   [10%,20%)         143344                0%
   [20%,30%)         172100                1%
   [30%,40%)         193173                1%
   [40%,50%)         127366                0%
   [50%,60%)         308350                2%
   [60%,70%)         257055                1%
   [70%,80%)         212410                1%
   [80%,90%)         295316                1%
   [90%,100%)        349280                2%
   100%             7313157               47%
 =================================================
 -the number of debug variables processed: 15327311
 -PC ranges covered: 67%
 -------------------------------------------------
 -total availability: 62%
 =================================================
```

Coverage with InstrRef without the target hooks fix:

```
 =================================================
            Debug Location Statistics
 =================================================
     cov%           samples         percentage(~)
 -------------------------------------------------
   0%               6052807               39%
   (0%,10%)          127710                0%
   [10%,20%)         129999                0%
   [20%,30%)         155011                1%
   [30%,40%)         171206                1%
   [40%,50%)         102861                0%
   [50%,60%)         264734                1%
   [60%,70%)         212386                1%
   [70%,80%)         176872                1%
   [80%,90%)         242120                1%
   [90%,100%)        254465                1%
   100%             7437215               48%
 =================================================
 -the number of debug variables processed: 15327386
 -PC ranges covered: 67%
 -------------------------------------------------
 -total availability: 60%
 =================================================
```

Coverage with InstrRef with the target hooks fix:

```
 =================================================
            Debug Location Statistics
 =================================================
     cov%           samples         percentage(~)
 -------------------------------------------------
   0%               5972267               39%
   (0%,10%)          118873                0%
   [10%,20%)         127138                0%
   [20%,30%)         153181                1%
   [30%,40%)         170102                1%
   [40%,50%)         102180                0%
   [50%,60%)         263672                1%
   [60%,70%)         212865                1%
   [70%,80%)         176633                1%
   [80%,90%)         242403                1%
   [90%,100%)        264441                1%
   100%             7494527               48%
 =================================================
 -the number of debug variables processed: 15298282
 -PC ranges covered: 71%
 -------------------------------------------------
 -total availability: 61%
 =================================================
```

I believe this should be a good indication that Instruction Referencing should be turned on for AArch64?
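As a quick sanity check on the first (dbg_value) table, the bucket samples can be summed and converted to percentages. The sketch below copies the sample counts from that table; rounding down with integer division is an observation that happens to reproduce the printed `percentage(~)` column here, not a claim about how llvm-locstats computes it.

```python
# Sample counts copied from the "coverage with dbg_value" statistics.
SAMPLES = {
    "0%": 5828021,
    "(0%,10%)": 127739,
    "[10%,20%)": 143344,
    "[20%,30%)": 172100,
    "[30%,40%)": 193173,
    "[40%,50%)": 127366,
    "[50%,60%)": 308350,
    "[60%,70%)": 257055,
    "[70%,80%)": 212410,
    "[80%,90%)": 295316,
    "[90%,100%)": 349280,
    "100%": 7313157,
}

# The buckets should add up to the reported number of debug variables.
total = sum(SAMPLES.values())

# Integer division rounds down, e.g. 47.7% is reported as 47.
percentages = {bucket: 100 * n // total for bucket, n in SAMPLES.items()}
```

The sum comes out to 15327311, matching the "number of debug variables processed" line for that run.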

…ts (#167597) Issue #147390

Fixes: #166059 --------- Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>

…167636) Add a new CMake variable, `LIBCXX_ASSERTION_SEMANTIC`, that largely mirrors `LIBCXX_HARDENING_MODE`, except that it also supports a special value `hardening_dependent` that indicates the semantic will be selected based on the hardening mode in effect:
- `fast` and `extensive` map to `quick_enforce`;
- `debug` maps to `enforce`.
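The `hardening_dependent` selection described above amounts to a small lookup. A minimal sketch follows, using only the three mappings stated in the commit message; `resolve_semantic` is a hypothetical helper, not part of libc++ or its CMake files, and hardening modes not listed in the message are deliberately left unmapped.

```python
# Mappings stated in the commit message; other modes are not covered here.
HARDENING_TO_SEMANTIC = {
    "fast": "quick_enforce",
    "extensive": "quick_enforce",
    "debug": "enforce",
}


def resolve_semantic(assertion_semantic, hardening_mode):
    """Pick the effective semantic for a configured value and hardening mode."""
    if assertion_semantic != "hardening_dependent":
        # An explicit semantic is used as given.
        return assertion_semantic
    # Raises KeyError for modes the commit message does not describe.
    return HARDENING_TO_SEMANTIC[hardening_mode]


fast_result = resolve_semantic("hardening_dependent", "fast")
debug_result = resolve_semantic("hardening_dependent", "debug")
explicit_result = resolve_semantic("enforce", "fast")
```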

Address post commit comments from #167958

Also add boilerplate to have a live instance when running opt configured from CommandFlags / TargetOptions.

This commit fixes the validation check for duplicate indices in the TOSA scatter operation when using int64 index tensors. Previously, use of int64 index tensors would cause a crash.
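For readers unfamiliar with the check, a duplicate-index test can be sketched as below. This is not the MLIR validation code; it assumes the indices are given as one list per batch and that duplicates are looked for within each batch, and `has_duplicate_indices` is a hypothetical name. Python integers are unbounded, so values outside the 32-bit range need no special handling in the sketch.

```python
def has_duplicate_indices(indices):
    """indices: list of per-batch lists of integer indices (assumed layout)."""
    for batch in indices:
        # A set drops repeats, so a size mismatch means a duplicate exists.
        if len(set(batch)) != len(batch):
            return True
    return False


# The second batch uses a value that does not fit in 32 bits.
unique_case = has_duplicate_indices([[0, 2, 1], [3, 3_000_000_000, 0]])
duplicate_case = has_duplicate_indices([[0, 2, 2]])
```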

… vectors (#168055) When simplifying min/max intrinsics with fixed-size vector constants, InstructionSimplify attempts to optimize element-wise. However, getAggregateElement() can return null for certain constant expressions like bitcasts, leading to a null pointer dereference. This patch adds a check to bail out of the optimization when getAggregateElement() returns null, preventing the crash while maintaining correct behavior for normal constant vectors. Fixes crash with patterns like: call <2 x half> @llvm.minnum.v2f16(<2 x half> %x, <2 x half> bitcast (<1 x i32> <i32 N> to <2 x half>))

This was previously under the ELF specific options section, but is actually only supported for Mach-O

This also improves the error message to be more clear for folks who haven't used a lot of rst.

…167818)

…IES (#167933) This is a fixed version of #167886. The build previously failed with `BUILD_SHARED_LIBS=ON`. After trying that locally, I uncovered a few other instances of lldb non-plugin libraries depending on clang transitively through lldbValueObject, so I added the correct clang libraries to their dependencies.

) This fixes the -fveclib flag getting lost on its way to the backend. Previously this was its own cl::opt with a random boolean. Move the flag handling into CommandFlags with other backend ABI-ish options, and have clang directly set it, rather than forcing it to go through command line parsing. Prior to de68181, codegen used TargetLibraryInfo to find the vector function. Clang has special handling for TargetLibraryInfo, where it would directly construct one with the vector library in the pass pipeline. RuntimeLibcallsInfo currently is not used as an analysis in codegen, and needs to know the vector library when constructed. RuntimeLibraryAnalysis could follow the same trick that TargetLibraryInfo is using in the future, but a lot more boilerplate changes are needed to thread that analysis through codegen. Ideally this would come from an IR module flag, and nothing would be in TargetOptions. For now, it's better for all of these sorts of controls to be consistent.

… DWARF units in .dwp files. (#167986) This patch updates the reading capabilities of the LLVM DWARF parser for an llvm-dwp patch, #167457, that will emit .dwp files where the compile units are DWARF32 and the .debug_str_offsets tables are emitted as DWARF64 to allow .debug_str sections that exceed 4GB in size.

#168105) …63019)" This reverts commit 92e5608.

This PR makes the following improvements to `vector.scatter` and its lowering pipeline:
- In addition to `memref`, accept a ranked `tensor` as the base operand of `vector.scatter`, similar to `vector.transfer_write`.
- Implement bufferization support for `vector.scatter`, so that tensor-based scatter ops can be fully lowered to memref-based forms.

This helps complete the functionality of the map_scatter decomposition. Full discussion can be found here: iree-org/iree#21135 --------- Signed-off-by: Ryutaro Okada <1015ryu88@gmail.com>

After the base branch was moved to main, this somehow ended up adding a second definition of RTLCI, instead of modifying the existing one. Also fix other build error with gcc bots.

…68111)

This PR adds a new utility function to check whether symbols used in OpenACC regions are legal for offloading. Functions must be marked with `acc routine` or be built-in intrinsics. Global symbols must be marked with `acc declare`. The utility is designed to be extensible, and the OpenACCSupport analysis has been updated to allow handling of additional symbols that do not necessarily use OpenACC attributes but are marked in a way that still guarantees the symbol will be available when offloading. For example, in the Flang implementation, CUF attributes can be validated as legal symbols.
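The rules described above can be modeled with a small predicate plus an extension hook. This is a Python sketch of the idea only, not the MLIR utility; the `Symbol` class, the attribute strings, and the `cuf_check` callback (including its `"cuf device"` marker) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str
    kind: str  # "function" or "global"
    attrs: set = field(default_factory=set)


def is_valid_symbol_use(sym, extra_checks=()):
    """Return True if the symbol may be used inside an offloaded region."""
    if sym.kind == "function" and ({"acc routine", "intrinsic"} & sym.attrs):
        return True
    if sym.kind == "global" and "acc declare" in sym.attrs:
        return True
    # Extension point: other markings that still guarantee availability.
    return any(check(sym) for check in extra_checks)


def cuf_check(sym):
    # Hypothetical front-end specific rule.
    return "cuf device" in sym.attrs


routine_ok = is_valid_symbol_use(Symbol("f", "function", {"acc routine"}))
plain_global = is_valid_symbol_use(Symbol("g", "global"))
cuf_global = is_valid_symbol_use(Symbol("d", "global", {"cuf device"}), [cuf_check])
```

Keeping the extra checks as callbacks leaves the core OpenACC rules unchanged while letting a front end add its own notion of a legal symbol.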

Extend test to cover different -force-target-instruction-cost settings.

FIROpenACCTransforms needs to link against MLIROpenACCUtils; otherwise, linking will fail: `undefined reference to mlir::acc::isValidSymbolUse`

…000 (#166634) Getting a gfx000 result from the `rocm-agent-enumerator` command was deprecated beginning with the release of ROCm 7, but the MLIR build system still filters it from results when looking for ROCm agents. This PR removes that filtering. There are a few other uses of "gfx000" in MLIR source, but those are used as default options for running some passes, and, to my understanding, have a semantically different meaning to the dummy result returned from `rocm-agent-enumerator` and don't need to be changed.

Same changes as in the fix for [165276](#165276), except that the unnecessary <vector> include in the test is removed to restore the Ubuntu build. This is not needed, as the allocatable modifier is not applicable to the default clause in C/C++. Co-authored-by: Sunil Kuravinakop <kuravina@pe31.hpc.amslabs.hpecorp.net>

This patch adds codegen for CBB and CBH, the CB variants operating on bytes and half-words, which allows folding sign- and zero-extensions. Since if-conversion needs to be able to undo conditional branches, we remember possibly folded zero- and sign-extensions, as well as potentially folded assertzext and assertsext, as additional arguments of the CBBAssertExt and CBHAssertExt pseudos during codegen.

This test is failing on the chromium x64 mac build because of invalid MIR. The rest of the patch is okay, so I am just deleting the test for now.

Add MemAlloc effect to the result so that cuf.alloc/cuf.allocate can be recognized by FIR alias analysis.

…kes precedence" The current "longest match takes precedence" rule for warning suppression mappings can be confusing, especially in long suppression files where tracking the length relationship between globs is difficult. For example, with the following rules, it's not immediately obvious why the first one should currently take precedence:

```
src:*test/*
src:*lld/*=emit
```

This commit changes the multi-match behavior so the last match takes precedence. This rule is easier to understand and consistent with the approach used by sanitizers, simplifying the mechanism by providing a uniform experience across different tools. This is potentially breaking, but very unlikely. An investigation of known uses showed they do not rely on the length.

Reviewers: thurstond, kadircet, fmayer

Pull Request: #162237
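The difference between the two precedence rules can be shown with the two globs from the example. The sketch below is a simplified model that uses Python's `fnmatch` (where `*` also matches `/`) instead of clang's matcher, and it assumes a rule without `=emit` means "suppress"; `longest_match`, `last_match`, and the sample path are made up for illustration.

```python
from fnmatch import fnmatchcase

# (glob, emit) pairs in file order; emit=False is taken to mean "suppress".
RULES = [("*test/*", False), ("*lld/*", True)]


def longest_match(path, rules):
    """Old rule: among matching globs, the longest pattern wins."""
    matches = [(glob, emit) for glob, emit in rules if fnmatchcase(path, glob)]
    if not matches:
        return None
    return max(matches, key=lambda m: len(m[0]))[1]


def last_match(path, rules):
    """New rule: among matching globs, the one listed last wins."""
    result = None
    for glob, emit in rules:
        if fnmatchcase(path, glob):
            result = emit
    return result


path = "lld/test/foo.cpp"  # hypothetical file matched by both globs
old_behavior = longest_match(path, RULES)  # "*test/*" is one character longer
new_behavior = last_match(path, RULES)     # "*lld/*" comes last in the file
```

Under the old rule the outcome depends on a one-character length difference between the globs; under the new rule it depends only on the order of the lines.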

This allows SDNodes to be validated against their expected type profiles and reduces the number of changes required to add a new node. There are a couple of nodes that are missing descriptions and one node that fails validation. Part of #119709. Pull Request: #168120

…ets. (#168124) Fixes a buildbot issue stemming from #167986

…68130) Reverts #168112

The apfloat code was added in #167848, and some bazel was added in #167916 but the runtime library for test-apfloat-emulation.mlir was missed. This patch adds the appropriate target.

…67193) This method is not used anywhere. Remove it.

…168138)

These instructions use `src0`, `imm`, `src1` as operands. Fixes SWDEV-566579.

…sCommutable Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.

… C API (#168145) Diagnose unsupported configurations when targeting the Python Limited C API. I used SEND_ERROR so that if there are multiple issues, you don't need to keep reconfiguring.

…abs in isCommutable" This reverts commit ddf5bb0 to fix buildbots https://lab.llvm.org/buildbot/#/builders/11/builds/28083.

I think this is quite a bit more readable than the nested conditionals. From review feedback that was not addressed precommit in #167973.

This PR upstreams the codegen for the x86 vec_ext builtins from the incubator. It is part of #167752.

topperc and others added 30 commits November 14, 2025 09:54

[AMDGPU][MC] Disallow nogds in ds_gws_* instructions (#166873)

306f49a

The ds_gws_* instructions require gds as an operand. However, when nogds is given, it is treated the same as gds. This patch fixes this to disallow nogds.

[Support] Prevent loss of file type flags when creating temporary (#1…

52f2a94

…67939) Non-binary output files from the compiler need the `OF_Text` flag set for encoding conversion to be performed correctly on z/OS. --------- Co-authored-by: Tony Tao <tonytao@ca.ibm.com>

[utils] don't warn when setting rlimit fails on Solaris (#167921)

8e4209a

Solaris doesn't define RLIMIT_NPROC, so this is expected to fail there. This fixes a test failure in llvm/utils/lit/tests/verbosity.py on Solaris due to this unexpected warning being included in the lit output.

[MLIR][LLVM] Debug info: import debug records directly (#167812)

3f0ef27

Effectively means we don't need to call into `llvmModule->convertFromNewDbgValues()` anymore. Added a flag to allow users to access the old behavior.

[Xtensa] TableGen-erate SDNode descriptions (#166253)

05e94c9

Part of #119709.

[libc] replace for loops with a call to memcpy in File (#165219)

bbece4b

Addresses `TODO`s in file.cpp by replacing data copies via for loops with calls to inline_memcpy. Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>

[BOLT][print] Add option '--print-only-file' (NFC) (#168023)

ac6daa8

With this option we can pass to BOLT names of functions to be printed through a file instead of specifying them all on command line.

[libc] Allow user-defined LIBC_ASSERT macro. (#168087)

cfce4a6

By only defining it if LIBC_ASSERT macro is not defined. Fixes #162392

Revert "[Clang][OpenMP] Bug fix Default clause variable category" (#1…

8b105cb

…68083) Reverts #165276 The newly added test failed on a number of buildbots.

lldb: Link delayimp on Windows (#168093)

d06a7dd

This is needed when building with `LLVM_LINK_LLVM_DYLIB` to build LLVM as a DLL on Windows. This effort is tracked in #109483.

Revert "[libc][test] split exit tests into two separate tests" (#168102)

5b798df

Reverts #166355

[MemCpyOpt][profcheck] Set unknown branch weights for certain selec…

17789e9

…ts (#167597) Issue #147390

[libc] fix EXPECT_EXIT suspend/timeout for darwin (#166065)

b9c769b

Fixes: #166059 --------- Signed-off-by: Shreeyash Pandey <shreeyash335@gmail.com>

[libc][POSIX][RISCV] Disabled clock_settime on RV32 (#168006)

9d7e341

[mlir][NVVM][NFC] Remove useless options form run lines (#168098)

07740fb

Address post commit comments from #167958

DeclareRuntimeLibcalls: Use RuntimeLibraryAnalysis (#167995)

f7a8d20

Also add boilerplate to have a live instance when running opt configured from CommandFlags / TargetOptions.

[mlir][tosa] Fix scatter duplicate indices check for int64 (#168085)

70b7958

This commit fixes the validation check for duplicate indices in the TOSA scatter operation when using int64 index tensors. Previously, use of int64 index tensors would cause a crash.

[docs] Fix llvm-strip -T flag section (#167987)

9fcb675

This was previously under the ELF specific options section, but is actually only supported for Mach-O

[docs] Fix invalid header length in llvm-ir2vec.rst (#168104)

0bdbf2c

This also improves the error message to be more clear for folks who haven't used a lot of rst.

[AMDGPU] Update buffer fat pointer docs for gfx1250, fix formatting (#…

0190951

…167818)

clayborg and others added 26 commits November 14, 2025 11:22

Revert "[Transform][LoadStoreVectorizer] allow redundant in Chain (#1… (

a407d02

#168105) …63019)" This reverts commit 92e5608.

opt: Fix bad merge of #167996 (#168110)

862d346

After the base branch was moved to main, this somehow ended up adding a second definition of RTLCI, instead of modifying the existing one. Also fix other build error with gcc bots.

DebugInfo: Relax codeview-empty-dbg-cu-crash test's version check (#1…

dbd97c8

…68111)

[LV] Also cover -force-target-instruction-cost=1 in tests.

77fd6be

Extend test to cover different -force-target-instruction-cost settings.

[flang][acc] Add missing dependency on MLIROpenACCUtils (#168117)

dc491d9

FIROpenACCTransforms needs to link against MLIROpenACCUtils; otherwise, linking will fail: `undefined reference to mlir::acc::isValidSymbolUse`

Remove instr-ref-target-hooks-sp-clobber.mir (#168125)

c40a694

This test is failing on the chromium x64 mac build because of invalid MIR. The rest of the patch is okay, so I am just deleting the test for now.

[flang][cuf] Add to cuf.alloc/cuf.allocate mem alloc effect (#167414)

1429628

Add MemAlloc effect to the result so that cuf.alloc/cuf.allocate can be recognized by FIR alias analysis.

Don't check frame base as varies if registers are available from targ…

4881512

…ets. (#168124) Fixes a buildbot issue stemming from #167986

Revert "[Clang][OpenMP] Bug fix Default clause variable category" (#1…

944278f

…68130) Reverts #168112

[mlir][bazel] Add apfloat test library (#168115)

2743543

The apfloat code was added in #167848, and some bazel was added in #167916 but the runtime library for test-apfloat-emulation.mlir was missed. This patch adds the appropriate target.

[NFC][Support] Remove unused getLongestMatch from SpecialCaseList (#1…

825ebef

…67193) This method is not used anywhere. Remove it.

[lldb] Add a test for capturing stdout/stderr from Python commands (#…

6dad2c2

…168138)

[AMDGPU] Fix wrong MSB encoding for V_FMAMK instructions (#168107)

72a6ae6

These instructions use `src0`, `imm`, `src1` as operands. Fixes SWDEV-566579.

[SLP]Check if the copyable element is a sub instruciton with abs in i…

ddf5bb0

…sCommutable Need to check if the non-copyable element is an instruction before actually trying to check its NSW attribute.

[lldb] Diagnose unsupported configurations when targeting the Limited…

459a64b

… C API (#168145) Diagnose unsupported configurations when targeting the Python Limited C API. I used SEND_ERROR so that if there are multiple issues, you don't need to keep reconfiguring.

Revert "[SLP]Check if the copyable element is a sub instruciton with …

e8cc0d2

…abs in isCommutable" This reverts commit ddf5bb0 to fix buildbots https://lab.llvm.org/buildbot/#/builders/11/builds/28083.

[ProfCheck] Refactor Select Instrumentation to use Early Exits (#168086)

4c4ffd3

I think this is quite a bit more readable than the nested conditionals. From review feedback that was not addressed precommit in #167973.

[CIR] Upstream CIR codegen for vec_ext x86 builtins (#167942)

e02fdf0

This PR upstreams the codegen for the x86 vec_ext builtins from the incubator. It is part of #167752.

pull bot locked and limited conversation to collaborators Nov 14, 2025

pull bot added the ⤵️ pull label Nov 14, 2025

pull bot merged commit e02fdf0 into optimizecompile:main Nov 14, 2025


41 participants
