merge main into amd-staging #628

ronlieb · 2025-11-19T20:10:34Z

No description provided.

When the first element of a trn mask is undef, the `isTRNMask` function assumes `WhichResult = 1`. That has a 50% chance of being wrong, so we fail to match some valid trn1/trn2. This patch introduces a more precise test to determine the correct value of `WhichResult`, based on corresponding code in the `isZIPMask` and `isUZPMask` functions. - This change is based on llvm#89578. I'd like to follow it up with a further change along the lines of llvm#167235.
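To make the undef case concrete, here is a minimal standalone sketch of the idea, not the code from the patch. It assumes a mask is a vector of lane indices with `-1` meaning undef, and the names `TrnKind` and `classifyTrnMask` are hypothetical. Instead of guessing from the first element, it checks every defined element against both the trn1 and trn2 patterns and keeps whichever one is not contradicted.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not an LLVM type).
enum class TrnKind { None, Trn1, Trn2 };

// Classify a shuffle mask as trn1 (<0, N, 2, N+2, ...>) or
// trn2 (<1, N+1, 3, N+3, ...>), treating negative entries as undef.
// Assumption: N is the number of mask elements and must be even.
TrnKind classifyTrnMask(const std::vector<int> &mask) {
  const std::size_t n = mask.size();
  if (n == 0 || n % 2 != 0)
    return TrnKind::None;

  bool candidate[2] = {true, true}; // [0] = trn1, [1] = trn2
  for (std::size_t i = 0; i < n; i += 2) {
    for (int which = 0; which < 2; ++which) {
      const int expectedEven = static_cast<int>(i) + which;
      const int expectedOdd = static_cast<int>(i + n) + which;
      // Only defined elements can rule a candidate out.
      if (mask[i] >= 0 && mask[i] != expectedEven)
        candidate[which] = false;
      if (mask[i + 1] >= 0 && mask[i + 1] != expectedOdd)
        candidate[which] = false;
    }
  }
  if (candidate[0])
    return TrnKind::Trn1; // also chosen when every element is undef
  if (candidate[1])
    return TrnKind::Trn2;
  return TrnKind::None;
}
```

With this approach a mask such as `<undef, 4, 2, 6>` is still recognized as trn1, whereas a rule that assumes `WhichResult = 1` whenever the first element is undef would reject it.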

… function [NFC] (llvm#168198) This PR moves the code that checks whether an LLVM intrinsic should be generated instead of a call to floating point math functions to a separate function. This simplifies `EmitBuiltinExpr` in `CGBuiltin.cpp` and will allow us to reuse the logic in ClangIR.

…ted code (llvm#168536) ODS-generated code can be included and used outside of the `mlir` namespace, so references to symbols in the `mlir` namespace must be fully qualified.

…les (llvm#165690) NeoverseZeroMove was introduced for Neoverse-V2 and was added to V3 and V3AE. Use NeoverseZeroMove for Neoverse-V1, N2, N3 in the same way, including these instructions:

MOV Xd|Wd, #0|XZR|WZR

For all the above Neoverse targets, the following instructions are also decoded as not utilizing the scheduling and execution resources of the machine:

MOV Wd,Wn
MOV Xd,Xn

For Neoverse-N3 only, these instructions also have zero latency:

FMOV Dd, Dn
FMOV Sd, Sn
MOV Vd, Vn (vector)
MOV Zd.D, Zn.D
PTRUE
PFALSE

This patch replaces the delinearization function used in DA, switching from one that depends on type information in GEPs to one that does not. There are three types of changes in regression tests: improvements, degradations, and degradations in features that will be removed. Since very few cases fall into the second category, I believe the impact of this change should be practically insignificant.

Do not consider loops with a zero backedge taken count as candidates for interchange. This seems like a sensible thing because it suggests the loop doesn't execute and there is no point in interchanging. As a bonus, this seems to avoid triggering an assert about phis and their uses from source code, so this is a partial fix for llvm#163954 but it needs more work to properly fix that.

Summary: These headers were originally intended to represent the functions that are available on the GPU, as provided by the LLVM libc implementation. The original plan was that LLVM libc would report which functions were supported and the offload interface would then mark those as supported. The problem is that these wrapper headers are very difficult to make work given the various libc extensions in use, so they were extremely fragile. OpenMP already declares all functions used inside of a target region as implicitly host / device, and these headers weren't even used for CUDA / HIP yet anyway. The only things we need to define right now are the stdio FILE types. If we want to make this work for CUDA we'd need to define these manually, but we're a ways off and that's way easier because they do proper overloading.

…168563) As part of investigating a related issue, I made the following changes to fir::runtime::genCharCompare():
- Renamed a variable
- Added an error check for the same kind of input args
- Updated another error check to use the same error found elsewhere in this source file

…llvm#168697) llvm#158690 plans on passing BFI as a lazy lambda to avoid computing BlockFrequencyInfo when not needed. In preparation for that, this PR removes BFI and PSI from some constructors that aren't used. It also consolidates the two calls to llvm::shouldOptimizeForSize so that the result is computed once and passed where needed. This also renames OptForSize in LoopVectorizationLegality to clarify that it's to prevent runtime SCEV checks, see https://reviews.llvm.org/D68082

This commit adds the first part of the output semantics. It only considers return values (and sret), but does not handle `inout` or `out` parameters yet. Those missing bits will reuse the same code, but will require additional testing and some fixups, so I plan to add them separately.

The 'atomic capture' variant of the `atomic` construct accepts either a single statement, or a compound statement containing two statements. Each of the statements it accepts meets a form of the previous read/write/update forms, or is a combination of two. The IR node for atomic capture takes two separate other acc.atomics, plus a terminator. This patch implements all of the lowering for these. Note: This gets the postfix-increment/decrement wrong, but fixing that takes enough effort that I'll do it in a follow-up patch.

Fixes llvm#167991

Fix documentation in `mpi`, `objc`, `openmp`, `performance`, `portability`, `readability` and `zircon`. This is part of the codebase cleanup described in llvm#167098

Unfortunately, in this configuration, the bots are forced to use the system libcxx, which is too old for what this test is verifying. In the future, we should re-enable building libcxx with asan on MacOS.

) Adds some general changes for supporting asan on AIX. Issue: llvm#138916

…-ranked memref (llvm#166959) Vectorization of a 1-d reduction where the output variable is a 1-ranked memref can generate an invalid `vector.transfer_write` with no indices for the memref, e.g.:

"vector.transfer_write"(%vec, %buff) <{...}> : (vector<f32>, memref<1xf32>) -> ()

This patch solves the problem by providing the expected number of indices (i.e. matching the rank of the memref).

…nChecker (llvm#168338) Fixes llvm#166573 --------- Co-authored-by: Donát Nagy <donat.nagy@ericsson.com> Co-authored-by: Alan Li <me@alanli.org>

…lvm#168546) Remove leftover implicit operands from SI_SPILL/SI_RESTORE. --------- Signed-off-by: John Lu <John.Lu@amd.com>

…OMPONENTS (llvm#168407) Fixes llvm#168393. Also adds top-level `MLIR_PYTHON_STUBGEN_ENABLED` CMake option.

Only some fortran source files in flang/test/Lower have been modified. The other files in the directory will be cleaned up in subsequent commits

This test turned out to not actually be that interesting. There was just a subshell usage that needed replacing with readfile, and then the test just works. Reviewers: fmayer, DanBlackwell, ndrewh Reviewed By: ndrewh Pull Request: llvm#168654

On AIX, the linker's release cadence is once per year and non-critical fixes are not backported to previous releases. We would like to make ThinLTO caching accessible to current customers, so this PR adds the cache flags as cl::opt options.

…llvm#168674) This patch simplifies iterator_range construction with the conversion constructor.

Note that llvm::size only works on types that allow std::distance in O(1).

…lvm#168616) Add a RAII `IfGuardEmitter` to insert simple #if guards and adopt it in InstrInfoEmitter.

- Remove file local functions out of `llvm` or anonymous namespace and make them static.
- Use namespace qualifier to define `BoUpSLP` class and several template specializations.

- Add declarations of various `MCAsmParserExtension` creation functions to MCAsmParserExtension.h and use namespace qualifiers to define these and some other functions.
- Add end of namespace comments.
- Fix indentation of `MCNullStreamer` class.
- Remove namespace surrounding code in MCWinEH.cpp and use "using namespace" instead.

…ly bodies (llvm#167523) Fixes llvm#167247

---

This PR addresses a case where Clang emitted `-Wmissing-noreturn` for virtual methods whose body consists of a `throw` expression:

```cpp
struct Base {
  virtual void foo() {
    throw std::runtime_error("error");
  }
};
```

Split out from llvm#168288

…68658) This test uses several shell features not supported by the internal shell, such as `$?` to get the exit code of a command, and `exit`. This patch adjusts the test to work with the internal shell by using bash to run the actual command with a zero exit code to ensure the file is deleted, and python to propagate the exit code up to lit.

Ternary with a constant condition and throw in the live part

Upstream Exception EhInflight op as a prerequisite for full catch handlers implementation.

Issue llvm#154992

This is the promised follow-up to llvm#167779. It simply adds a test case provided by philnik777

@main

…8564) Legalizing the following IR to `tosa` using `tf-tosa-opt` from the `tensorflow` repo:

```
func.func @main(%arg0: tensor<?x?x?x?xf32>) -> tensor<?x?x?x5xf32> {
  %0 = "tfl.pseudo_const"() <{value = dense<0.000000e+00> : tensor<5xf32>}> : () -> tensor<5xf32>
  %1 = tfl.add(%arg0, %0) <{fused_activation_function = "NONE"}> : (tensor<?x?x?x?xf32>, tensor<5xf32>) -> tensor<?x?x?x5xf32>
  return %1 : tensor<?x?x?x5xf32>
}
```

fails with

```
error: 'tosa.add' op operands don't have matching ranks
  %1 = tfl.add(%arg0, %0) <{fused_activation_function = "NONE"}> : (tensor<?x?x?x?xf32>, tensor<5xf32>) -> tensor<?x?x?x5xf32>
       ^
tfl.mlir:3:10: note: see current operation: %1 = "tosa.add"(%arg0, %0) : (tensor<?x?x?x?xf32>, tensor<5xf32>) -> tensor<?x?x?x5xf32>
// -----// IR Dump After TosaLegalizeTFLPass Failed (tosa-legalize-tfl) //----- //
"func.func"() <{function_type = (tensor<?x?x?x?xf32>) -> tensor<?x?x?x5xf32>, sym_name = "main"}> ({
^bb0(%arg0: tensor<?x?x?x?xf32>):
  %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<5xf32>}> : () -> tensor<5xf32>
  %1 = "tosa.add"(%arg0, %0) : (tensor<?x?x?x?xf32>, tensor<5xf32>) -> tensor<?x?x?x5xf32>
  "func.return"(%1) : (tensor<?x?x?x5xf32>) -> ()
}) : () -> ()
```

This is because of the following check in `computeReshapeOutput`, called from the `EqualizeRanks` function:

```
if (lowerRankDim != 1 && higherRankDim != 1 &&
    lowerRankDim != higherRankDim)
  return failure();
```

Based on the broadcast semantics defined in https://mlir.llvm.org/docs/Traits/Broadcastable/#dimension-inference I think it's legal to allow `lowerRankDim != higherRankDim` if one of them is dynamic. At runtime, the verifier should enforce that:
1. if lowerRankDim is dynamic and higherRankDim is static, then the dynamic dim matches the static dim, and vice versa
2. if both are dynamic, they should match

It's not necessary to error out at op construction time.
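As a rough illustration of the relaxed rule argued for above, the following standalone sketch treats a dimension pair as acceptable at construction time whenever either side is dynamic. It is not the code from the patch: `kDynamicDim` is a hypothetical sentinel chosen for this example, and `dimsMayBroadcast` is a made-up helper name.

```cpp
#include <cstdint>

// Hypothetical sentinel for a dynamic dimension in this sketch.
constexpr std::int64_t kDynamicDim = -1;

// Returns true if a (lower-rank, higher-rank) dimension pair should be
// accepted when the op is built. Dynamic dimensions are deferred to a
// runtime check instead of being rejected up front.
bool dimsMayBroadcast(std::int64_t lowerRankDim, std::int64_t higherRankDim) {
  if (lowerRankDim == kDynamicDim || higherRankDim == kDynamicDim)
    return true; // cannot be decided statically
  // Static case: same condition as the check quoted above, inverted.
  return lowerRankDim == 1 || higherRankDim == 1 ||
         lowerRankDim == higherRankDim;
}
```

Under these assumptions the failing example is accepted, because the static `5` is paired with a dynamic dimension, while a static mismatch such as `5` against `3` is still rejected.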

`numeric_limits` already has an `is_signed` member. We can use that instead of using `std::is_signed`.

NFC patch which moves `DiagnosticsRendering` from `Utility` to `Host`. This refactoring is needed for llvm#168603. It adds a method to check whether the current terminal supports Unicode or not. This will be OS dependent and a better fit for `Host`. Since `Utility` cannot depend on `Host`, `DiagnosticsRendering` must live in `Host` instead.

…ructions (llvm#152557) This PR updates the ReleaseAtCycles for all instructions described in Section 11 of the RVV Spec: Vector Integer Arithmetic Instructions. The data used comes from camel-cdr.

To ensure we stay ahead of the ~6 month time horizon. This new version seems to be mostly small version bumps and minor fixes that probably are not too relevant to us.

This patch makes the metrics container build/push job use the common container build/push actions to simplify the workflow by quite a bit.

Switches to the config added in llvm#164891 Fixes llvm#55924

z1-cciauto · 2025-11-19T20:11:49Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2883

ginsbach and others added 30 commits November 19, 2025 12:53

[MLIR][ODS] Fully qualify namespace for mlir::Attribute in ODS genera…

655662e

…ted code (llvm#168536) ODS-generated code can be included and used outside of the `mlir` namespace, so references to symbols in the `mlir` namespace must be fully qualified.

[RISCV][test] Add sincos-expansion.ll test case

7b8eee6

[libc++] Make views::iota aware of __int128 (llvm#167869)

ad31e11

Fixes llvm#167991

[clang-tidy][docs][NFC] Enforce 80 characters limit (4/4) (llvm#168049)

a7ba8dc

Fix documentation in `mpi`, `objc`, `openmp`, `performance`, `portability`, `readability` and `zircon`. This is part of the codebase cleanup described in llvm#167098

[lldb] Skip TestLibcxxInternalsRecognizer on asan + MacOS

93a1327

Unfortunately, in this configuration, the bots are forced to use the system libcxx, which is too old for what this test is verifying. In the future, we should re-enable building libcxx with asan on MacOS.

[mlir][tensor] Drop unused AffineExpr variable (NFC) (llvm#168651)

1723a51

[asan] Implement address sanitizer on AIX: platform support (llvm#139587

c62fc06

) Adds some general changes for supporting asan on AIX. Issue: llvm#138916

[clang][analyzer] Add defer_lock_t modelling to BlockInCriticalSectio…

b11b7b3

…nChecker (llvm#168338) Fixes llvm#166573 --------- Co-authored-by: Donát Nagy <donat.nagy@ericsson.com> Co-authored-by: Alan Li <me@alanli.org>

[AMDGPU] Remove leftover implicit operands from SI_SPILL/SI_RESTORE. (l…

b79a665

…lvm#168546) Remove leftover implicit operands from SI_SPILL/SI_RESTORE. --------- Signed-off-by: John Lu <John.Lu@amd.com>

[MLIR][Python] make sure stubs get installed with LLVM_DISTRIBUTION_C…

86a82f2

…OMPONENTS (llvm#168407) Fixes llvm#168393. Also adds top-level `MLIR_PYTHON_STUBGEN_ENABLED` CMake option.

[flang][NFC] Strip trailing whitespace from tests (7 of N)

9cd40da

Only some fortran source files in flang/test/Lower have been modified. The other files in the directory will be cleaned up in subsequent commits

[llvm] Construct iterator_range with the conversion constructor (NFC) (…

30e5f76

…llvm#168674) This patch simplifies iterator_range construction with the conversion constructor.

[llvm] Use llvm::size (NFC) (llvm#168675)

19129ea

Note that llvm::size only works on types that allow std::distance in O(1).
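The constant-time restriction can be sketched in plain standard C++. The helper below is a hypothetical stand-in written for this example, not the definition of `llvm::size`: it refuses to compile for ranges whose iterators are not random access, so `std::distance` is guaranteed to be O(1).

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper: size of a range, restricted to ranges whose
// iterators are random access so that std::distance runs in O(1).
template <class Range>
std::size_t constantTimeSize(const Range &range) {
  using Iter = decltype(std::begin(range));
  using Category = typename std::iterator_traits<Iter>::iterator_category;
  static_assert(
      std::is_base_of<std::random_access_iterator_tag, Category>::value,
      "constantTimeSize requires random access iterators");
  return static_cast<std::size_t>(
      std::distance(std::begin(range), std::end(range)));
}
```

Calling it on a `std::vector` compiles, while calling it on a `std::list` would trip the `static_assert`, which is the kind of guard the note above describes.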

[NFC][TableGen] Add IfGuardEmitter and adopt it in InstrInfoEmitter (l…

139f726

…lvm#168616) Add a RAII `IfGuardEmitter` to insert simple #if guards and adopt it in InstrInfoEmitter.
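For readers unfamiliar with the pattern, a RAII guard emitter can be sketched as follows. This is an illustrative stand-in using `std::ostream`, not the actual `IfGuardEmitter` from TableGen; the class name `ScopedIfGuard`, the function `emitGuardedDecl`, and the macro name `GET_EXAMPLE_DECLS` are all hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical RAII helper: writes "#if <cond>" on construction and the
// matching "#endif" on destruction, so the guard cannot be left open.
class ScopedIfGuard {
public:
  ScopedIfGuard(std::ostream &os, std::string condition)
      : os_(os), condition_(std::move(condition)) {
    os_ << "#if " << condition_ << "\n";
  }
  ~ScopedIfGuard() { os_ << "#endif // " << condition_ << "\n"; }

  ScopedIfGuard(const ScopedIfGuard &) = delete;
  ScopedIfGuard &operator=(const ScopedIfGuard &) = delete;

private:
  std::ostream &os_;
  std::string condition_;
};

// Usage: everything emitted inside the scope ends up between the guards.
std::string emitGuardedDecl() {
  std::ostringstream out;
  {
    ScopedIfGuard guard(out, "defined(GET_EXAMPLE_DECLS)");
    out << "void example();\n";
  }
  return out.str();
}
```

The benefit of tying the `#endif` to the destructor is that early returns or added code paths in an emitter cannot forget to close the guard.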

[NFC][LLVM] Namespace cleanup in SLPVectorizer (llvm#168623)

4703195

- Remove file local functions out of `llvm` or anonymous namespace and make them static.
- Use namespace qualifier to define `BoUpSLP` class and several template specializations.

arsenm and others added 16 commits November 19, 2025 12:27

DAG: Reorder SDPatternMatch combinators earlier (llvm#168625)

1782e50

Split out from llvm#168288

[CIR] Ternary with const cond and throw in the live part (llvm#168432)

36cbcec

Ternary with a constant condition and throw in the live part

[CIR] Upstream Exception EhInflight op (llvm#165621)

009ec6f

Upstream Exception EhInflight op as a prerequisite for full catch handlers implementation.

Issue llvm#154992

Add test case for xsgetn in basic_filebuf (llvm#167937)

f65294e

This is the promised follow-up to llvm#167779. It simply adds a test case provided by philnik777

[gn] "port" 5efce73 (arm 32-bit asm compiler-rt)

87a1fd1

[libc++] Remove is_signed<T> use from <limits> (llvm#168334)

8bfd294

`numeric_limits` already has an `is_signed` member. We can use that instead of using `std::is_signed`.
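A small standalone check, assuming only the standard library, shows the two spellings agreeing for a few built-in arithmetic types. The helper names `signedViaLimits` and `signedViaTrait` are hypothetical and exist only for this comparison; this is not code from libc++.

```cpp
#include <limits>
#include <type_traits>

// Signedness as reported by the numeric_limits member.
template <class T>
constexpr bool signedViaLimits() {
  return std::numeric_limits<T>::is_signed;
}

// Signedness as reported by the type trait.
template <class T>
constexpr bool signedViaTrait() {
  return std::is_signed<T>::value;
}

// Compile-time checks for a handful of arithmetic types.
static_assert(signedViaLimits<int>() == signedViaTrait<int>(), "int");
static_assert(signedViaLimits<unsigned>() == signedViaTrait<unsigned>(),
              "unsigned");
static_assert(signedViaLimits<double>() == signedViaTrait<double>(), "double");
```

Using the member that `numeric_limits` already exposes avoids depending on the separate trait for the same answer.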

[gn] port c62fc06

449807a

[RISCV] Update X60 ReleaseAtCycles for Vector Integer Arithmetic Inst…

3890a4a

…ructions (llvm#152557) This PR updates the ReleaseAtCycles for all instructions described in Section 11 of the RVV Spec: Vector Integer Arithmetic Instructions. The data used comes from camel-cdr.

[Github] Bump Runner Version in CI Containers

8ab7b60

To ensure we stay ahead of the ~6 month time horizon. This new version seems to be mostly small version bumps and minor fixes that probably are not too relevant to us.

[Github] Make metrics container build use common actions (llvm#168667)

0f615dc

This patch makes the metrics container build/push job use the common container build/push actions to simplify the workflow by quite a bit.

[gn] port 22a2cae (AttrIsTypeDependent.inc)

6f8e87b

[bazel] Flip --enable_bzlmod to true (llvm#168555)

a4456a5

Switches to the config added in llvm#164891 Fixes llvm#55924

merge main into amd-staging

ecd0f61

ronlieb requested review from a team and dpalermo November 19, 2025 20:10

ronlieb requested review from Groverkss, fabianmcg, nicolasvasilache and stellaraccident as code owners November 19, 2025 20:10

dpalermo approved these changes Nov 19, 2025

ronlieb removed request for Groverkss, fabianmcg, nicolasvasilache and stellaraccident November 19, 2025 23:38

z1-cciauto merged commit 304af5d into amd-staging Nov 19, 2025
18 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251119122147 branch November 19, 2025 23:45

48 participants
