merge main into amd-staging #689

z1-cciauto · 2025-11-26T12:06:34Z

No description provided.

The Triple directly has the datalayout string in it, so just use that. The logical flow here is kind of a mess. We were constructing a temporary target machine in the asm parser to infer the datalayout, throwing it away, and then creating another target machine for the actual compilation. The flow of the Triple construction is still convoluted, but we can at least drop the TargetMachine.

…fc) (llvm#169533)

Adds new builders for `tensor.insert_slice` and `tensor.extract_slice` Ops for which the _offsets_ and the _strides_ are all 0s and 1s, respectively. This allows us to write:

```cpp
// No offsets and no strides - implicitly set to 0s and 1s,
// respectively.
tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeSizes);
```

instead of:

```cpp
// Strides are initialised explicitly to 1s
Attribute oneIdxAttr = rewriter.getIndexAttr(1);
SmallVector<OpFoldResult> writeStrides(destRank, oneIdxAttr);

// Offsets are initialised explicitly to 0s
Attribute zeroIdxAttr = rewriter.getIndexAttr(0);
SmallVector<OpFoldResult> writeOffsets(destRank, zeroIdxAttr);

tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeOffsets, writeSizes, writeStrides);
```

…ions (llvm#169165)

These two functions are almost identical, except for the handling of different vector types, so merging them eliminates some duplication. This also fixes some bugs, as the "sizeless" vector code was missing checks for several cases. This meant type checking would crash if:

- The LHS or RHS type was void
- The LHS or RHS type was a fixed-length vector type
- There was not a scalable vector type for the result element count/size

These are fixed with this patch and tested in Sema/AArch64/sve-vector-conditional-op.cpp.

Fixes llvm#169025

…lvm#165714)

Adds initial support for GPU by-ref reductions. The main problem for reduction by reference is that, prior to this PR, we were shuffling (from remote lanes within the same warp or across different warps within the block) pointers/references to the private reduction values rather than the private reduction values themselves.

In particular, this diff adds support for reductions on scalar allocatables where reductions happen on loops nested in `target` regions. For example:

```fortran
integer :: i
real, allocatable :: scalar_alloc

allocate(scalar_alloc)
scalar_alloc = 0

!$omp target map(tofrom: scalar_alloc)
!$omp parallel do reduction(+: scalar_alloc)
do i = 1, 1000000
  scalar_alloc = scalar_alloc + 1
end do
!$omp end target
```

This PR supports by-ref reductions on the intra- and inter-warp levels. There are still steps to be taken for full support of by-ref reductions, for example:

* Inter-block value combination is still not supported. Therefore, `target teams distribute parallel do` is still not supported.
* Support for dynamically-sized arrays still needs to be added.
* Support for more than one allocatable/array on the same `reduction` clause still needs to be added.

Split off from PR llvm#163525, this standalone patch replaces almost all the remaining cases where undef is used as value in loop vectoriser tests. This will reduce the likelihood of contributors hitting the `undef deprecator` warning in github. NOTE: The remaining use of undef in iv_outside_user.ll will be fixed in a separate PR. I've removed the test stride_undef from version-mem-access.ll, since there is already a stride_poison test.

…lvm#169076)

This commit improves the handling of GetElementPtr (GEP) instructions for Logical SPIR-V. It includes:

- Rewriting of GEPs that are not allowed in Logical SPIR-V (specifically, handling non-zero first indices by rebuilding access chains or adjusting types).
- Better deduction of element types for pointer casting.
- Updates to instruction selection to ensure GEPs are correctly lowered to OpAccessChain or OpInBoundsAccessChain only when valid (e.g. first index 0).
- Support for standard HLSL cbuffer layouts in tests.

z1-cciauto · 2025-11-26T12:08:34Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2983

arsenm and others added 22 commits November 26, 2025 04:40

[ORC] Pass FailedSNs by const-ref. NFCI. (llvm#169600)

76ec25f

Avoids a vector copy.
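To illustrate why passing a vector by const reference avoids a copy, here is a minimal standalone sketch. It does not use the ORC API: `CountedName`, `countByValue`, `countByConstRef`, and the two `copies…` helpers are hypothetical names invented for this example, and the element type simply counts how often it is copy-constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Global copy counter used by the hypothetical element type below.
static int NumCopies = 0;

// Hypothetical element type that records every copy made of it.
struct CountedName {
  CountedName() = default;
  CountedName(const CountedName &) { ++NumCopies; }
  CountedName &operator=(const CountedName &) {
    ++NumCopies;
    return *this;
  }
};

// By value: passing an lvalue copies the whole vector into the parameter.
std::size_t countByValue(std::vector<CountedName> Names) {
  return Names.size();
}

// By const reference: the callee reads the caller's vector in place.
std::size_t countByConstRef(const std::vector<CountedName> &Names) {
  return Names.size();
}

// Number of element copies made by one by-value call with N elements.
int copiesByValue(std::size_t N) {
  std::vector<CountedName> V(N); // default-constructs N elements, no copies
  NumCopies = 0;
  countByValue(V);
  return NumCopies;
}

// Number of element copies made by one const-ref call with N elements.
int copiesByConstRef(std::size_t N) {
  std::vector<CountedName> V(N);
  NumCopies = 0;
  countByConstRef(V);
  return NumCopies;
}
```

With an lvalue argument, the by-value call copies every element once, while the const-ref call copies none; the function body is otherwise unchanged, which is consistent with the NFCI tag on the commit.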

[RISCV] Don't add duplicate Zilsd hints. (llvm#169554)

4e7c65e

This matches what ARM does. I'm not sure if there are any bad effects from the duplicate hints. I have seen the duplicate hints in the debug output and confirmed this removes them.

[AMDGPU] Update strict floating point tests to be more comprehensive (l…

a7f9a4d

…lvm#169578)

[clang] Implement dump() for AddrLabelDiff APValues (llvm#169505)

a57fe84

[clang][bytecode][NFC] Clean up Integral::from() functions (llvm#169513)

8396d4c

[clang][bytecode] Add some convenience API to BitcastBuffer (llvm#169516

6459f39

) So we check the offsets before using them.

[Flang-rt] Remove COMPILE_ONLY from flang-rt CMake file. (llvm#169534)

00aca53

COMPILE_ONLY was introduced in cmake 3.27.0. We cannot use this feature, because LLVM supports cmake 3.20.0.

MC: Remove unneeded parameter MCAsmBackend *. NFC

e04c01b

[clang][bytecode][NFC] Make Program::getNativePointer() const (llvm#1…

97732dd

…69502)

[lldb][NFC] Fix incorrect comments in TestArm64InstEmulation

e493e90

[RISCV] Remove intrinsic declarations in tests, NFC (llvm#167474)

93f2deb

As @mshockwave mentioned in llvm#156415, we don't need to declare intrinsics in tests now, so this PR removes them.

[AArch64] Add vector tests for add(trunc(shift))

de674fb

[clang][bytecode][NFC] Remove unused Integral range functions (llvm#1…

63e4b8c

…69508)

[LifetimeSafety] Move GSL pointer/owner type detection to LifetimeAnn…

c43ac96

…otations (llvm#169620) Refactored GSL pointer and owner type detection functions to improve code organization and reusability.

[LoopCacheAnalysis] Fix crash after llvm#164798 (llvm#169486)

3036de7

Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
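The widen-then-compare idea can be sketched in isolation. This is not the LoopCacheAnalysis code or any LLVM type: `WidthInt`, `zext`, `equalSameWidth`, and `equalAfterWidening` are hypothetical stand-ins, and the use of zero extension is an assumption made only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width integer: a value tagged with its bit width.
struct WidthInt {
  unsigned Bits;
  std::uint64_t Value;
};

// Zero-extend X to NewBits (assumed extension kind for this sketch).
WidthInt zext(WidthInt X, unsigned NewBits) {
  assert(NewBits >= X.Bits && NewBits <= 64 && "can only widen, up to 64 bits");
  return {NewBits, X.Value};
}

// Strict comparison: asserts if the operand widths differ, mirroring the
// kind of failure described above when the two types do not match.
bool equalSameWidth(WidthInt A, WidthInt B) {
  assert(A.Bits == B.Bits && "operands must have the same width");
  return A.Value == B.Value;
}

// Safe comparison: bring both operands to the wider width first.
bool equalAfterWidening(WidthInt A, WidthInt B) {
  unsigned Wide = std::max(A.Bits, B.Bits);
  return equalSameWidth(zext(A, Wide), zext(B, Wide));
}
```

Calling `equalSameWidth` directly on a 32-bit and a 64-bit operand would trip the assertion; `equalAfterWidening` avoids that by normalising the widths before the comparison.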

merge main into amd-staging

304346f

z1-cciauto requested a review from nicolasvasilache as a code owner November 26, 2025 12:06

z1-cciauto requested a review from a team November 26, 2025 12:06

ronlieb approved these changes Nov 26, 2025

z1-cciauto merged commit 3529145 into amd-staging Nov 26, 2025
8 checks passed

z1-cciauto deleted the upstream_merge_202511260706 branch November 26, 2025 14:49
