forked from llvm/llvm-project
merge main into amd-staging #689
Merged
Conversation
The Triple directly has the datalayout string in it, so just use that. The logical flow here is kind of a mess. We were constructing a temporary target machine in the asm parser to infer the datalayout, throwing it away, and then creating another target machine for the actual compilation. The flow of the Triple construction is still convoluted, but we can at least drop the TargetMachine.
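As a rough illustration of the flow change (the types and accessors below are schematic stand-ins, not the actual LLVM interfaces touched by this commit): instead of instantiating a target machine solely to query its datalayout, the datalayout string already carried by the triple is used directly.

```cpp
// Schematic only: `Triple`, `TargetMachine`, and the accessors here are
// simplified stand-ins, not the real LLVM interfaces changed by this commit.
#include <memory>
#include <string>

struct DataLayout { std::string Repr; };

struct TargetMachine {
  DataLayout createDataLayout() const { return {"e-m:e-i64:64-..."}; }
};

struct Triple {
  // The point of the change: the triple already knows its datalayout string.
  std::string getDataLayoutString() const { return "e-m:e-i64:64-..."; }
};

// Before: build a throwaway TargetMachine in the asm parser just to infer
// the datalayout, then discard it.
DataLayout inferViaTemporaryTM(const Triple &) {
  auto TM = std::make_unique<TargetMachine>();
  return TM->createDataLayout();
}

// After: read the datalayout string straight from the Triple; no temporary
// TargetMachine is needed.
DataLayout inferFromTriple(const Triple &TT) {
  return DataLayout{TT.getDataLayoutString()};
}
```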
Avoids a vector copy.
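The summary is terse; as general context only (not the specific call site this commit touches), the usual way such a copy is avoided in LLVM-style C++ is to accept the data by const reference or `llvm::ArrayRef` instead of taking the container by value:

```cpp
#include <numeric>
#include <vector>

// Taking the vector by value copies every element on each call.
int sumByValue(std::vector<int> V) {
  return std::accumulate(V.begin(), V.end(), 0);
}

// Taking it by const reference (or llvm::ArrayRef<int> in LLVM code)
// reads the caller's storage directly and avoids the copy.
int sumByRef(const std::vector<int> &V) {
  return std::accumulate(V.begin(), V.end(), 0);
}
```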
This matches what ARM does. I'm not sure if there are any bad effects from the duplicate hints. I have seen the duplicate hints in the debug output and confirmed this removes them.
COMPILE_ONLY was introduced in CMake 3.27.0. We cannot use this feature because LLVM's minimum supported CMake version is 3.20.0.
As @mshockwave mentioned in llvm#156415, we no longer need to declare intrinsics in tests, so this PR removes them.
…fc) (llvm#169533)

Adds new builders for `tensor.insert_slice` and `tensor.extract_slice` Ops for which the _offsets_ and the _strides_ are all 0s and 1s, respectively. This allows us to write:

```cpp
// No offsets and no strides - implicitly set to 0s and 1s, respectively.
tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeSizes);
```

instead of:

```cpp
// Strides are initialised explicitly to 1s
Attribute oneIdxAttr = rewriter.getIndexAttr(1);
SmallVector<OpFoldResult> writeStrides(destRank, oneIdxAttr);

// Offsets are initialised explicitly to 0s
Attribute zeroIdxAttr = rewriter.getIndexAttr(0);
SmallVector<OpFoldResult> writeOffsets(destRank, zeroIdxAttr);

tensor::InsertSliceOp::create(rewriter, loc, src, dest, writeOffsets, writeSizes, writeStrides);
```
…ions (llvm#169165)

These two functions are almost identical, except for the handling of different vector types, so merging them eliminates some duplication. This also fixes some bugs, as the "sizeless" vector code path was missing checks for several cases. This meant type checking would crash if:

- The LHS or RHS type was void
- The LHS or RHS type was a fixed-length vector type
- There was no scalable vector type for the result element count/size

These are fixed with this patch and tested in Sema/AArch64/sve-vector-conditional-op.cpp.

Fixes llvm#169025
…lvm#165714)

Adds initial support for GPU by-ref reductions. The main problem for reduction by reference is that, prior to this PR, we were shuffling (from remote lanes within the same warp or across different warps within the block) pointers/references to the private reduction values rather than the private reduction values themselves.

In particular, this diff adds support for reductions on scalar allocatables where reductions happen on loops nested in `target` regions. For example:

```fortran
integer :: i
real, allocatable :: scalar_alloc

allocate(scalar_alloc)
scalar_alloc = 0

!$omp target map(tofrom: scalar_alloc)
!$omp parallel do reduction(+: scalar_alloc)
do i = 1, 1000000
  scalar_alloc = scalar_alloc + 1
end do
!$omp end target
```

This PR supports by-ref reductions at the intra- and inter-warp levels. Some steps remain for full support of by-ref reductions, for example:

* Inter-block value combination is not yet supported; therefore, `target teams distribute parallel do` is still not supported.
* Support for dynamically-sized arrays still needs to be added.
* Support for more than one allocatable/array on the same `reduction` clause.
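A minimal sketch of the pointer-vs-value distinction described above; `shuffleDown` is a hypothetical stand-in for the GPU warp-shuffle primitive, not the actual call emitted by the OpenMP lowering:

```cpp
// Hypothetical stand-in for a warp shuffle primitive: returns `Value` as held
// by the lane `Offset` positions away. Real code uses a GPU intrinsic.
float shuffleDown(float Value, unsigned Offset);

// By-ref reduction across a warp. The earlier lowering effectively shuffled
// the pointer to the private copy, which is meaningless on a remote lane;
// the fix is to load the private value first and shuffle the value itself.
float combineAcrossWarp(const float *PrivateCopy, unsigned WarpSize) {
  float Acc = *PrivateCopy;                  // dereference before shuffling
  for (unsigned Offset = WarpSize / 2; Offset > 0; Offset /= 2)
    Acc += shuffleDown(Acc, Offset);         // combine values, not addresses
  return Acc;
}
```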
…otations (llvm#169620) Refactored GSL pointer and owner type detection functions to improve code organization and reusability.
Fix the assertion failure after llvm#164798. The issue is that the comparison `Sizes.back() == ElementSize` can fail when their types are different. We should cast them to the wider type before the comparison.
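The exact types involved come from llvm#164798 and aren't shown here; the sketch below only illustrates the general pattern with `llvm::APInt`, whose equality comparison requires matching bit widths, so the narrower operand is extended to the wider width before comparing:

```cpp
#include "llvm/ADT/APInt.h"
#include <algorithm>
using llvm::APInt;

// Comparing APInts of different bit widths asserts; extending the narrower
// operand to the wider width first gives a well-defined comparison.
static bool equalValues(const APInt &A, const APInt &B) {
  unsigned Width = std::max(A.getBitWidth(), B.getBitWidth());
  return A.zext(Width) == B.zext(Width);   // compare at the common wider width
}
```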
Split off from PR llvm#163525, this standalone patch replaces almost all the remaining cases where undef is used as a value in loop vectoriser tests. This will reduce the likelihood of contributors hitting the `undef deprecator` warning on GitHub. NOTE: The remaining use of undef in iv_outside_user.ll will be fixed in a separate PR. I've removed the test stride_undef from version-mem-access.ll, since there is already a stride_poison test.
…lvm#169076)

This commit improves the handling of GetElementPtr (GEP) instructions for Logical SPIR-V. It includes:

- Rewriting of GEPs that are not allowed in Logical SPIR-V (specifically, handling non-zero first indices by rebuilding access chains or adjusting types).
- Better deduction of element types for pointer casting.
- Updates to instruction selection to ensure GEPs are lowered to OpAccessChain or OpInBoundsAccessChain only when valid (e.g. first index 0).
- Support for standard HLSL cbuffer layouts in tests.
ronlieb approved these changes on Nov 26, 2025