merge main into amd-staging #666

ronlieb · 2025-11-24T15:47:18Z

No description provided.

…texpr test - still tested _mm_cmpeq_epi8 (llvm#169311)

The current implementation of SV_Position was very basic, to allow implementing and testing some semantics. Now that semantic support is more robust, I can move forward and implement the whole semantic logic. The DX part is still somewhat of a placeholder.

@f

The attributor can infer the alignment of %p at the call-site in this example [1]:

```
define void @f(ptr align 8 %p, i1 %c1, i1 %c2) {
entry:
  br i1 %c1, label %bb.1, label %exit

bb.1:
  call void (...) @llvm.fake.use(ptr %p)
  br label %exit

exit:
  ret void
}
```

but not when there's an additional conditional branch:

```
define void @f(ptr align 8 %p, i1 %c1, i1 %c2) {
entry:
  br i1 %c1, label %bb.1, label %exit

bb.1:
  br i1 %c2, label %bb.2, label %exit

bb.2:
  call void (...) @llvm.fake.use(ptr %p)
  br label %exit

exit:
  ret void
}
```

unless `-attributor-annotate-decl-cs` is enabled. This patch extends `followUsesInMBEC` to handle such recursive branches.

n.b. admittedly I wrote this patch before discovering that inferring the alignment in this example is already possible with `-attributor-annotate-decl-cs`; I came to realise this while writing the tests. This still seems like a gap, judging by the existing FIXMEs, and the alignment can now be inferred in this particular example without the flag.

[1] https://godbolt.org/z/aKoc75so5

This patch fixes a crash in Clang that occurs when the compiler retrieves the element type of a complex type but receives a sugared type. See example here: https://godbolt.org/z/cdbdeMcaT

Extend the load-of-expand-shape rewrite pattern to support folding a `memref.expand_shape` and `vector.transfer_read` when the permutation map on `vector.transfer_read` is a minor identity.

---------

Signed-off-by: Jack Frankland <jack.frankland@arm.com>

Introduce `AVX512_128_SETALLONES`, `AVX512_256_SETALLONES` pseudos to generate all-ones vectors.

Post-RA expansion:
- Use VEX vpcmpeqd for XMM/YMM0–15 when available (matches current codegen as `AVX512_128/256_SETALLONES` will be preferred over `AVX1/2_SETALLONES` for AVX512VL target).
- Use EVEX `vpternlogd imm=0xFF` for high regs.

Includes MIR tests for both VEX and EVEX paths.
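As a rough illustration of why both expansions yield an all-ones vector, the sketch below models one 32-bit lane in Python. The helper names `ternlog32` and `cmpeq32` are hypothetical and are not part of the patch; they only mimic the bitwise behaviour of `vpternlogd` (an 8-entry truth table selected per bit) and `vpcmpeqd` (lane-wise equality).

```python
MASK32 = 0xFFFFFFFF


def ternlog32(a, b, c, imm8):
    """Model of a ternary-logic op on one 32-bit lane (hypothetical helper).

    For every bit position, the three input bits form a 3-bit index into
    imm8, and the selected imm8 bit becomes the result bit.
    """
    result = 0
    for bit in range(32):
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((imm8 >> index) & 1) << bit
    return result


def cmpeq32(a, b):
    """Model of a lane-wise equality compare: all ones if equal, else zero."""
    return MASK32 if (a & MASK32) == (b & MASK32) else 0


# With imm8 = 0xFF every truth-table entry is 1, so the inputs do not matter.
all_ones_evex = ternlog32(0x12345678, 0x0, 0xDEADBEEF, 0xFF)

# Comparing a register with itself is always true, whatever it holds.
all_ones_vex = cmpeq32(0xCAFEF00D, 0xCAFEF00D)
```

Both results are independent of the prior register contents, which is what makes either instruction usable as a set-all-ones idiom; the choice between them in the patch is about encoding (VEX versus EVEX) and register range.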

…lates. (llvm#168946) Reduces the pain of manually editing tests when applying the same changes over multiple instructions and keeping them consistent.

This patch adds unary nodes plus and minus, introduces unary type conversions, and adds integral promotion to the type system.

…wards branches (llvm#168398)

If we have a conditional branch, followed by an epilogue, followed by more code, LLDB will incorrectly compute unwind information through instruction emulation. Consider this:

```
// ...
<+16>: b.ne ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE

// epilogue start
<+20>: ldp x29, x30, [sp, #0x20]
<+24>: add sp, sp, #0x30
<+28>: ret
// epilogue end

AFTER_EPILOGUE:
<+32>: do something
// ...
<+48>: ret

DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE:
<+52>: stp x22, x23, [sp, #0x10]
<+56>: mov x22, #0x1
<+64>: b ; <+32> AFTER_EPILOGUE
```

LLDB will think that the unwind state of +32 is the same as +28. This is false, as +32 _never_ executes after +28.

The root cause of the problem is the order in which instructions are visited; they are visited in the order they appear in the text, with unwind state always being forwarded to positive branch offsets, but never to negative offsets.

In the example above, `AFTER_EPILOGUE` should inherit the state of the branch in +64, but it doesn't because `AFTER_EPILOGUE` is visited right after the `ret` in +28.

Fixing this should be simple: maintain a stack of instructions to visit. While the stack is not empty, take the next instruction on stack and visit it.

* After visiting a non-branching instruction, push the next instruction and forward unwind state to it.
* After visiting a branch with one or more known targets, push the known branch targets and forward state to them.
* In all other cases (ret, or branch to register), don't push nor forward anything.

Never push an instruction already on the stack. Like the algorithm today, this new algorithm also assumes that, if two instructions branch to the same target, the unwind state in both better be the same.

(Note: yes, branch to register is also handled incorrectly today, and will still be incorrect).
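The stack-based traversal described above can be sketched in Python as follows. This is a toy model, not LLDB's code: the `Insn` class, the `forward_unwind_states` function, and the use of a single integer (the CFA offset from `sp`) as the "unwind state" are all assumptions made for illustration. It also assumes that a conditional branch falls through to the next instruction, and it simplifies "never push an instruction already on the stack" to "never push an instruction that already has a state".

```python
from dataclasses import dataclass


@dataclass
class Insn:
    offset: int
    kind: str = "plain"        # "plain", "branch", or "stop" (ret / branch to register)
    targets: tuple = ()        # known branch targets, as offsets
    conditional: bool = False  # assumption: conditional branches also fall through
    cfa_delta: int = 0         # toy unwind effect: change to the CFA offset


def forward_unwind_states(insns, entry_offset, entry_state):
    """Return the unwind state on entry to each reachable instruction."""
    index_of = {insn.offset: i for i, insn in enumerate(insns)}
    state_at = {entry_offset: entry_state}
    stack = [entry_offset]
    while stack:
        offset = stack.pop()
        i = index_of[offset]
        insn = insns[i]
        state_after = state_at[offset] + insn.cfa_delta
        has_next = i + 1 < len(insns)
        successors = []
        if insn.kind == "plain" and has_next:
            successors.append(insns[i + 1].offset)
        elif insn.kind == "branch":
            successors.extend(insn.targets)
            if insn.conditional and has_next:
                successors.append(insns[i + 1].offset)
        # "stop" instructions push nothing and forward nothing.
        for succ in successors:
            if succ not in state_at:  # first state to arrive wins
                state_at[succ] = state_after
                stack.append(succ)
    return state_at


# The example from the commit message, with the elided parts left out.
program = [
    Insn(16, "branch", targets=(52,), conditional=True),
    Insn(20),                  # ldp x29, x30, [sp, #0x20]
    Insn(24, cfa_delta=-0x30), # add sp, sp, #0x30
    Insn(28, "stop"),          # ret
    Insn(32),                  # AFTER_EPILOGUE
    Insn(48, "stop"),          # ret
    Insn(52),                  # stp x22, x23, [sp, #0x10]
    Insn(56),                  # mov x22, #0x1
    Insn(64, "branch", targets=(32,)),
]
states = forward_unwind_states(program, entry_offset=16, entry_state=0x30)
```

In this model, `states[28]` reflects the torn-down frame, while `states[32]` is inherited from the backwards branch at +64 rather than from the `ret` at +28, which is the behaviour the commit message asks for.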

Fixes llvm#169297

This patch implements the lowering for the 'copy' clause for a function-local declare directive. This is the first of the clauses that requires a 'cleanup' step, so it also includes some basic infrastructure for that. Fortunately there are only 8 clauses (only 6 of which require cleanup), so the if/else chain won't get too long. Also fortunately, we don't have to include any of the AST components, as it is possible to tell all the required details from the entry operation itself.

`[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

When interleaving a loop with an early exit, the parts before the active lane will be all zero. Currently we emit @llvm.experimental.cttz.elts with ZeroIsPoison=true for these parts, which means that they will produce poison. We don't see any miscompiles today on AArch64 because it has the same lowering for cttz.elts regardless of ZeroIsPoison, but this may cause issues on RISC-V when interleaving. This fixes it by setting ZeroIsPoison=false. The codegen is slightly worse on RISC-V when ZeroIsPoison=false and we could potentially recover it by enabling it again when UF=1, but this is left to another PR. This is split off from llvm#168738, where LastActiveLane can get expanded to a FirstActiveLane with an all-zeroes mask.
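To make the poison issue concrete, here is a small Python model. It is only a sketch: `POISON`, `cttz_elts`, and `first_active_lane` are hypothetical names, masks are modelled as lists of 0/1 with lane 0 first, and the per-part combination is an assumed simplification rather than the vectorizer's actual expansion.

```python
POISON = object()  # stand-in for an LLVM poison value


def cttz_elts(mask, zero_is_poison):
    """Count zero elements before the first set lane (lane 0 first).

    For an all-zero mask the result is poison when zero_is_poison is set,
    otherwise the number of elements.
    """
    for lane, bit in enumerate(mask):
        if bit:
            return lane
    return POISON if zero_is_poison else len(mask)


def first_active_lane(parts, zero_is_poison):
    """Combine per-part counts into one lane index across interleaved parts."""
    vf = len(parts[0])
    for part_index, part in enumerate(parts):
        count = cttz_elts(part, zero_is_poison)
        if count is POISON:
            return POISON  # poison from an earlier part taints the result
        if count != vf:
            return part_index * vf + count
    return len(parts) * vf


# UF=2, VF=4: the early exit triggers in lane 1 of the second part,
# so the first part is all zero.
parts = [[0, 0, 0, 0], [0, 1, 0, 0]]
safe = first_active_lane(parts, zero_is_poison=False)
unsafe = first_active_lane(parts, zero_is_poison=True)
```

With `zero_is_poison=False` the all-zero first part contributes a well-defined count and the search moves on to the next part; with `zero_is_poison=True` the same input already produces poison in the first part.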

…8410)

More missed target checks. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>

This is exactly like the 'copy', except the exit operation is a 'delete' instead of a 'copyout'. Also, creating the 'delete' op takes one fewer argument, so we have to do some special handling when creating that.

z1-cciauto · 2025-11-24T15:49:05Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2949

RKSimon and others added 20 commits November 24, 2025 12:32

[X86] avx2-builtins.c - fix copy+paste typo in _mm256_cmpeq_epi8 cons…

72bfa28

…texpr test - still tested _mm_cmpeq_epi8 (llvm#169311)

[mlir] Avoid else after return in ScalableValueBounds (NFC) (llvm#169211

e4cff3c

)

[X86][NFC] Add -show-mc-encoding to check register misuse (llvm#169264

a27842c

)

[Utils][update_mc_test_checks] Support generating asm tests from temp…

83765f4

…lates. (llvm#168946) Reduces the pain of manually editing tests when applying the same changes over multiple instructions and keeping them consistent.

[LLDB] Add unary plus and minus to DIL (llvm#155617)

d5927a6

This patch adds unary nodes plus and minus, introduces unary type conversions, and adds integral promotion to the type system.

[llvm][utils][lit] Fix imports in ManyTests.py example (llvm#169328)

4a567e3

Fixes llvm#169297

[libc++][forward_list] Applied [[nodiscard]] (llvm#169019)

ceea07d

`[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

[AArch64] Update costs for fshl/r and add rotr/l variants. NFC

1580f4b

AMDGPU/GlobalISel: Combine S16 copy-trunc-readanylane-anyext (llvm#16…

ad0acf4

…8410)

[OpenMP][SPIRV] Disable exceptions for OpenMP SPIR-V (llvm#169094)

71952df

More missed target checks. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>

[OpenACC][CIR] copyin lowering for func-local- declare (llvm#169336)

d542dce

This is exactly like the 'copy', except the exit operation is a 'delete' instead of a 'copyout'. Also, creating the 'delete' op takes one fewer argument, so we have to do some special handling when creating that.

merge main into amd-staging

c08f05e

ronlieb requested review from a team and dpalermo November 24, 2025 15:47

ronlieb requested review from Groverkss and nicolasvasilache as code owners November 24, 2025 15:47

dpalermo approved these changes Nov 24, 2025

ronlieb requested review from a team and removed request for Groverkss and nicolasvasilache November 24, 2025 17:18

z1-cciauto merged commit 32c812f into amd-staging Nov 24, 2025
14 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251124093421 branch November 24, 2025 18:57


22 participants
