forked from llvm/llvm-project
merge main into amd-staging #666
Merged
z1-cciauto
merged 20 commits into
amd-staging
from
amd/merge/upstream_merge_20251124093421
Nov 24, 2025
…texpr test - still tested _mm_cmpeq_epi8 (llvm#169311)
The previous implementation of SV_Position was deliberately basic, just enough to implement and test some semantics. Now that semantic support is more robust, I can move forward and implement the whole semantic logic. The DX part is still something of a placeholder.
The attributor can infer the alignment of %p at the call-site in this example [1]:

```
define void @f(ptr align 8 %p, i1 %c1, i1 %c2) {
entry:
  br i1 %c1, label %bb.1, label %exit

bb.1:
  call void (...) @llvm.fake.use(ptr %p)
  br label %exit

exit:
  ret void
}
```

but not when there's an additional conditional branch:

```
define void @f(ptr align 8 %p, i1 %c1, i1 %c2) {
entry:
  br i1 %c1, label %bb.1, label %exit

bb.1:
  br i1 %c2, label %bb.2, label %exit

bb.2:
  call void (...) @llvm.fake.use(ptr %p)
  br label %exit

exit:
  ret void
}
```

unless `-attributor-annotate-decl-cs` is enabled. This patch extends `followUsesInMBEC` to handle such recursive branches.

n.b. Admittedly, I wrote this patch before discovering that inferring the alignment in this example is already possible with `-attributor-annotate-decl-cs`; I only realised this while writing the tests. Judging by the existing FIXMEs, this still seems like a gap, and the alignment can now be inferred in this particular example without the flag.

[1] https://godbolt.org/z/aKoc75so5
This patch fixes a crash in Clang that occurs when the compiler retrieves the element type of a complex type but receives a sugared type. See the example here: https://godbolt.org/z/cdbdeMcaT
Extend the load-of-expand-shape rewrite pattern to support folding a `memref.expand_shape` into a `vector.transfer_read` when the permutation map on the `vector.transfer_read` is a minor identity.

---------

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
Introduce `AVX512_128_SETALLONES` and `AVX512_256_SETALLONES` pseudos to generate all-ones vectors.

Post-RA expansion:
- Use VEX `vpcmpeqd` for XMM/YMM0–15 when available (matches current codegen, as `AVX512_128/256_SETALLONES` will be preferred over `AVX1/2_SETALLONES` for AVX512VL targets).
- Use EVEX `vpternlogd imm=0xFF` for high regs.

Includes MIR tests for both VEX and EVEX paths.
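The register split in the post-RA expansion can be pictured with a small model. This is only an illustrative sketch, not the patch's code: the function name `expand_set_all_ones` is hypothetical, the output is Intel-syntax assembly text rather than MIR, and it assumes the VEX form is always available for registers 0 to 15.

```python
def expand_set_all_ones(reg_index: int, width_bits: int) -> str:
    """Pick an all-ones idiom for a vector register (toy model).

    Hypothetical helper: registers 0-15 are reachable with a VEX encoding,
    registers 16-31 need an EVEX encoding.
    """
    if width_bits not in (128, 256):
        raise ValueError("only 128-bit and 256-bit vectors are modelled")
    if not 0 <= reg_index <= 31:
        raise ValueError("register index must be in 0..31")

    prefix = "xmm" if width_bits == 128 else "ymm"
    reg = f"{prefix}{reg_index}"

    if reg_index < 16:
        # Comparing a register with itself for equality sets every bit.
        return f"vpcmpeqd {reg}, {reg}, {reg}"
    # Truth table 0xFF yields 1 for every input combination.
    return f"vpternlogd {reg}, {reg}, {reg}, 0xFF"


print(expand_set_all_ones(3, 128))
print(expand_set_all_ones(20, 256))
```

Both idioms produce all ones regardless of the register's previous contents, so the choice affects only the encoding.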
…lates. (llvm#168946) Reduces the pain of manually editing tests when the same changes must be applied across multiple instructions and kept consistent.
This patch adds unary nodes plus and minus, introduces unary type conversions, and adds integral promotion to the type system.
…wards branches (llvm#168398)

If we have a conditional branch, followed by an epilogue, followed by more code, LLDB will incorrectly compute unwind information through instruction emulation. Consider this:

```
// ...
<+16>: b.ne ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE
// epilogue start
<+20>: ldp x29, x30, [sp, #0x20]
<+24>: add sp, sp, #0x30
<+28>: ret
// epilogue end
AFTER_EPILOGUE:
<+32>: do something
// ...
<+48>: ret
DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE:
<+52>: stp x22, x23, [sp, #0x10]
<+56>: mov x22, #0x1
<+64>: b ; <+32> AFTER_EPILOGUE
```

LLDB will think that the unwind state of +32 is the same as +28. This is false, as +32 _never_ executes after +28.

The root cause of the problem is the order in which instructions are visited; they are visited in the order they appear in the text, with unwind state always being forwarded to positive branch offsets, but never to negative offsets. In the example above, `AFTER_EPILOGUE` should inherit the state of the branch in +64, but it doesn't because `AFTER_EPILOGUE` is visited right after the `ret` in +28.

Fixing this should be simple: maintain a stack of instructions to visit. While the stack is not empty, take the next instruction on the stack and visit it.

* After visiting a non-branching instruction, push the next instruction and forward unwind state to it.
* After visiting a branch with one or more known targets, push the known branch targets and forward state to them.
* In all other cases (ret, or branch to register), neither push nor forward anything.

Never push an instruction already on the stack. Like the algorithm today, this new algorithm also assumes that, if two instructions branch to the same target, the unwind state at both must be the same.

(Note: yes, branch to register is also handled incorrectly today, and will still be incorrect.)
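The difference between the two visiting orders can be reproduced on a toy model of the example above. This is a sketch under stated assumptions, not LLDB's implementation: the unwind state is reduced to a single integer (the CFA offset from `sp`), the names `Insn`, `text_order` and `worklist` are hypothetical, a conditional branch is modelled as having both a known target and a fall-through successor, and "already on the stack" is approximated as "already has a state".

```python
from typing import Dict, List, NamedTuple, Tuple


class Insn(NamedTuple):
    offset: int
    name: str
    targets: Tuple[int, ...] = ()  # known branch targets (hypothetical field)
    falls_through: bool = True     # False for ret and unconditional branches
    cfa_delta: int = 0             # change to the toy unwind state


# Toy version of the listing above; elided instructions are simply absent.
INSNS: List[Insn] = [
    Insn(16, "b.ne", targets=(52,)),
    Insn(20, "ldp"),
    Insn(24, "add sp", cfa_delta=-0x30),
    Insn(28, "ret", falls_through=False),
    Insn(32, "do something"),
    Insn(48, "ret", falls_through=False),
    Insn(52, "stp"),
    Insn(56, "mov"),
    Insn(64, "b", targets=(32,), falls_through=False),
]


def text_order(insns: List[Insn], entry_state: int) -> Dict[int, int]:
    """Simplified model of the text-order pass: state flows to the next
    instruction in the text and to forward branch targets only."""
    state = {insns[0].offset: entry_state}
    for idx, insn in enumerate(insns):
        out = state[insn.offset] + insn.cfa_delta
        for target in insn.targets:
            if target > insn.offset:  # backward branches are ignored
                state.setdefault(target, out)
        if idx + 1 < len(insns):
            state.setdefault(insns[idx + 1].offset, out)
    return state


def worklist(insns: List[Insn], entry_state: int) -> Dict[int, int]:
    """Stack-based traversal: state flows only along real control-flow edges."""
    index = {insn.offset: i for i, insn in enumerate(insns)}
    entry = insns[0].offset
    state = {entry: entry_state}
    stack = [entry]
    while stack:
        offset = stack.pop()
        i = index[offset]
        insn = insns[i]
        out = state[offset] + insn.cfa_delta
        successors = list(insn.targets)
        if insn.falls_through and i + 1 < len(insns):
            successors.append(insns[i + 1].offset)
        for succ in successors:
            # Having a state means the instruction was already pushed once.
            if succ in index and succ not in state:
                state[succ] = out
                stack.append(succ)
    return state


old = text_order(INSNS, 0x30)
new = worklist(INSNS, 0x30)
print(hex(old[32]), hex(new[32]))
```

In this model the text-order pass gives +32 the post-epilogue state it inherits from the `ret` at +28, while the stack-based traversal gives +32 the state forwarded from the branch at +64.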
This patch implements the lowering for the 'copy' clause for a function-local declare directive. This is the first of the clauses that requires a 'cleanup' step, so it also includes some basic infrastructure for that. Fortunately there are only 8 clauses (only 6 of which require cleanup), so the if/else chain won't get too long. Also fortunately, we don't have to include any of the AST components, as it is possible to tell all the required details from the entry operation itself.
`[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
When interleaving a loop with an early exit, the parts before the active lane will be all zero. Currently we emit @llvm.experimental.cttz.elts with ZeroIsPoison=true for these parts, which means that they will produce poison. We don't see any miscompiles today on AArch64 because it has the same lowering for cttz.elts regardless of ZeroIsPoison, but this may cause issues on RISC-V when interleaving. This fixes it by setting ZeroIsPoison=false. The codegen is slightly worse on RISC-V when ZeroIsPoison=false and we could potentially recover it by enabling it again when UF=1, but this is left to another PR. This is split off from llvm#168738, where LastActiveLane can get expanded to a FirstActiveLane with an all-zeroes mask.
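To see why poison from an all-zero part matters, here is a small model of the intrinsic and of one way the per-part results might be combined. It is a sketch, not the vectorizer's code: `POISON`, `cttz_elts` and `first_active_lane` are hypothetical names, and the select chain that compares each part's count against the vector length is an assumption about how the parts are merged. Only the intrinsic's all-zero behaviour (element count when ZeroIsPoison=false, poison otherwise) is taken as given.

```python
POISON = object()  # sentinel standing in for an LLVM poison value


def cttz_elts(mask, zero_is_poison):
    """Index of the first set lane; for an all-zero mask, the lane count
    or poison depending on zero_is_poison."""
    for lane, bit in enumerate(mask):
        if bit:
            return lane
    return POISON if zero_is_poison else len(mask)


def first_active_lane(parts, zero_is_poison):
    """Combine per-part counts from the last part to the first (assumed scheme)."""
    vf = len(parts[0])
    last = len(parts) - 1
    tz = cttz_elts(parts[last], zero_is_poison)
    result = POISON if tz is POISON else last * vf + tz
    for idx in range(last - 1, -1, -1):
        tz = cttz_elts(parts[idx], zero_is_poison)
        if tz is POISON:
            # The comparison "tz != vf" would itself be poison, so the
            # select condition, and therefore the result, is poison.
            result = POISON
            continue
        if tz != vf:  # this part has an active lane, so it takes precedence
            result = idx * vf + tz
    return result


# Two interleaved parts of four lanes; the exit condition fires in lane 1 of part 1.
parts = [[0, 0, 0, 0], [0, 1, 0, 0]]
print(first_active_lane(parts, zero_is_poison=False))
print(first_active_lane(parts, zero_is_poison=True) is POISON)
```

With ZeroIsPoison=false the all-zero leading part contributes a well-defined count and the result is the global lane index; with ZeroIsPoison=true the same input taints the result in this model.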
More missed target checks. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
This is exactly like the 'copy', except the exit operation is a 'delete' instead of a 'copyout'. Also, creating the 'delete' op takes one fewer argument, so we have to do some special handling when creating it.
Collaborator
dpalermo
approved these changes
Nov 24, 2025
No description provided.