@z1-cciauto
Collaborator

No description provided.

arsenm and others added 22 commits November 26, 2025 04:40
The Triple directly has the datalayout string in it, so just
use that.

The logical flow here is kind of a mess. We were constructing
a temporary target machine in the asm parser to infer the datalayout,
throwing it away, and then creating another target machine for the
actual compilation. The flow of the Triple construction is still
convoluted, but we can at least drop the TargetMachine.
This matches what ARM does. I'm not sure if there are any bad effects
from the duplicate hints. I have seen the duplicate hints in the debug
output and confirmed this removes them.
COMPILE_ONLY was introduced in cmake 3.27.0. We cannot use this feature,
because LLVM supports cmake 3.20.0.
As @mshockwave mentioned in
llvm#156415, we no longer need to
declare intrinsics in tests, so this PR removes them.
…fc) (llvm#169533)

Adds new builders for `tensor.insert_slice` and `tensor.extract_slice`
Ops for which the _offsets_ and the _strides_ are all 0s and 1s,
respectively. This allows us to write:
```cpp
// No offsets and no strides - implicitly set to 0s and 1s,
// respectively.
tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeSizes);
```

instead of:
```cpp
// Strides are initialised explicitly to 1s
Attribute oneIdxAttr = rewriter.getIndexAttr(1);
SmallVector<OpFoldResult> writeStrides(destRank, oneIdxAttr);

// Offsets are initialised explicitly to 0s
Attribute zeroIdxAttr = rewriter.getIndexAttr(0);
SmallVector<OpFoldResult> writeOffsets(destRank, zeroIdxAttr);

tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeOffsets,
                              writeSizes, writeStrides);
```
…ions (llvm#169165)

These two functions are almost identical, except for the handling of
different vector types, so merging them eliminates some duplication.
This also fixes some bugs, as "sizeless" vector code was missing checks
for several cases.

This meant type checking would crash if:

 - The LHS or RHS type was void
 - The LHS or RHS type was a fixed-length vector type
 - There was not a scalable vector type for the result element count/size

These are fixed with this patch and tested in
Sema/AArch64/sve-vector-conditional-op.cpp.

Fixes llvm#169025
…lvm#165714)

Adds initial support for GPU by-ref reductions. The main problem for
reduction by reference is that, prior to this PR, we were shuffling
(from remote lanes within the same warp or across different warps within
the block) pointers/references to the private reduction values rather
than the private reduction values themselves.

In particular, this diff adds support for reductions on scalar
allocatables where reductions happen on loops nested in `target`
regions. For example:

```fortran
  integer :: i
  real, allocatable :: scalar_alloc

  allocate(scalar_alloc)
  scalar_alloc = 0

  !$omp target map(tofrom: scalar_alloc)
  !$omp parallel do reduction(+: scalar_alloc)
  do i = 1, 1000000
    scalar_alloc = scalar_alloc + 1
  end do
  !$omp end target
```

This PR supports by-ref reductions on the intra- and inter-warp levels.
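
To illustrate why the values themselves have to travel between lanes, here is a minimal host-side C++ sketch of a shuffle-based tree reduction within one warp. It is a simulation under stated assumptions, not the code this PR generates: `shuffleDown` and `warpReduceSum` are hypothetical names, the "warp" is a plain `std::vector`, and the lane count is assumed to be a power of two.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simulation of a shuffle-down exchange: lane i reads the value
// held by lane i + offset. Lanes with no partner keep their own value.
// The payload is the private reduction value itself, not its address.
std::vector<float> shuffleDown(const std::vector<float> &lanes,
                               std::size_t offset) {
  std::vector<float> out(lanes);
  for (std::size_t i = 0; i + offset < lanes.size(); ++i)
    out[i] = lanes[i + offset];
  return out;
}

// Tree reduction over one simulated warp. Assumes lanes.size() is a power of
// two; after the loop, lane 0 holds the sum of all private values.
float warpReduceSum(std::vector<float> lanes) {
  for (std::size_t offset = lanes.size() / 2; offset > 0; offset /= 2) {
    std::vector<float> remote = shuffleDown(lanes, offset);
    for (std::size_t i = 0; i < lanes.size(); ++i)
      lanes[i] += remote[i];
  }
  return lanes.empty() ? 0.0f : lanes[0];
}
```

In this model, calling `warpReduceSum` on 32 lanes that each hold `1.0f` leaves the full sum in lane 0. If each lane exchanged only a pointer to its private storage, the receiving lane would get an address rather than the data to combine, which is the problem described above.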

There are still steps to be taken for full support of by-ref
reductions, for example:
* Inter-block value combination is still not supported.
Therefore, `target teams distribute parallel do` is still not supported.
* Support for dynamically-sized arrays still needs to be added.
* Support for more than one allocatable/array on the same `reduction`
clause.
…otations (llvm#169620)

Refactored GSL pointer and owner type detection functions to improve code organization and reusability.
Fix the assertion failure after llvm#164798. The issue is that the
comparison `Sizes.back() == ElementSize` can fail when their types are
different. We should cast them to the wider type before the comparison.
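
As a generic illustration of this kind of fix (not the actual LLVM code), the sketch below models a typed constant as a bit width plus a value. `TypedConst`, `widenTo`, and `equalAfterWidening` are hypothetical names, and the sketch assumes unsigned values, so widening leaves the payload unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a typed constant: a bit width plus a value.
struct TypedConst {
  unsigned bits;
  std::uint64_t value;
};

// Naive equality also compares the type, so the same number carried in two
// different widths compares unequal.
bool operator==(const TypedConst &a, const TypedConst &b) {
  return a.bits == b.bits && a.value == b.value;
}

// Zero-extend to a wider type; the unsigned payload is unchanged.
TypedConst widenTo(TypedConst c, unsigned newBits) {
  assert(newBits >= c.bits && "can only widen");
  return TypedConst{newBits, c.value};
}

// Bring both operands to the wider of the two types before comparing.
bool equalAfterWidening(TypedConst a, TypedConst b) {
  unsigned wide = std::max(a.bits, b.bits);
  return widenTo(a, wide) == widenTo(b, wide);
}
```

Here `TypedConst{32, 8} == TypedConst{64, 8}` is false even though both hold 8, while `equalAfterWidening` treats them as equal and still distinguishes genuinely different values.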
Split off from PR llvm#163525, this standalone patch replaces almost all the
remaining cases where undef is used as a value in loop vectoriser tests.
This will reduce the likelihood of contributors hitting the `undef
deprecator` warning on GitHub.
NOTE: The remaining use of undef in iv_outside_user.ll will be fixed in
a separate PR.

I've removed the test stride_undef from version-mem-access.ll, since
there is already a stride_poison test.
…lvm#169076)

This commit improves the handling of GetElementPtr (GEP) instructions
for Logical SPIR-V. It includes:

- Rewriting of GEPs that are not allowed in Logical SPIR-V
  (specifically, handling non-zero first indices by rebuilding access
  chains or adjusting types).
- Better deduction of element types for pointer casting.
- Updates to instruction selection to ensure GEPs are correctly lowered
  to OpAccessChain or OpInBoundsAccessChain only when valid (e.g. first
  index 0).
- Support for standard HLSL cbuffer layouts in tests.
@z1-cciauto z1-cciauto requested a review from a team November 26, 2025 12:06
@z1-cciauto z1-cciauto merged commit 3529145 into amd-staging Nov 26, 2025
8 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202511260706 branch November 26, 2025 14:49