merge main into amd-staging #562

ronlieb · 2025-11-11T13:44:11Z

No description provided.

…w clangOptions library" (llvm#167374) This relands llvm#167348. The original PR was reverted due to a reported build failure, which was later diagnosed as a local issue in the developer’s checkout or build state. See discussion here: llvm#163659 (comment) No additional changes have been made in this reland.

This isn't reachable today but will come into play once we reorder passes for llvm#147352 and llvm#147351. Note that the `CBufferRowIntrin` helper struct is copied from the `DXILCBufferAccess` pass, but it will be removed from there when we simplify that pass in llvm#147351

trans bf16 insts do not support clamp and omod

)

…m#167276) Preparing to move most of the implementation out of the header file. * llvm#167280 --------- Co-authored-by: Naveen Seth Hanig <naveen.hanig@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…lvm#167383) Re-land: Add test for Complex imag literal GNU extension after updating the name

…#152830) This adds the codegen support for the dyn_groupprivate clause.

It is undefined behavior to call a function with a mismatched calling convention. Rather than crash on this behavior, it should compile. This LLVM defect was identified via the AMD Fuzzing project.

…166658) (llvm#166846) The previous implementation in llvm#166252 (rolled back in llvm#166658) caused buildbot failures due to a bug in an added test. The modified `UseAfterClose` did not pass a `struct flock` value to `fcntl`: https://github.com/llvm/llvm-project/blob/2d5170594147b42a37698760d6e0194eec4f1390/libc/test/src/fcntl/fcntl_test.cpp#L175 Which ASAN caught and errored in the `fcntl` implementation when the unspecified argument was accessed: https://github.com/llvm/llvm-project/blob/c12cb2892c808af459eaa270b8738a2b375ecc9b/libc/src/__support/OSUtil/linux/fcntl.cpp#L59

Fixes regressions with llvm#159493 after 476a6ea

Use getDefiningRecipe to future-proof the code. Split off from llvm#156262 as suggested.

This commit introduces `SpecialCaseList::Match`, a small struct to hold the matched rule and its line number. This simplifies the `match` methods by allowing them to return a single value instead of using a callback. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
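To illustrate the design choice described above (returning one small value instead of reporting through a callback), here is a minimal sketch. It is not the LLVM implementation: `Rule`, `Match`, `findMatch`, and the exact-string comparison are hypothetical stand-ins, and the real `SpecialCaseList::Match` may hold different fields.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical rule record: a pattern plus the line it was read from.
struct Rule {
  std::string pattern;
  unsigned line;
};

// Hypothetical result type, loosely modeled on the description above:
// the matched rule and its line number. lineNo == 0 means "no match".
struct Match {
  std::string rule;
  unsigned lineNo;
  bool matched() const { return lineNo != 0; }
};

// Returns the first rule equal to `query` as a single value, so callers
// do not need to pass a callback to learn which rule matched.
// (Assumption: exact comparison stands in for real glob/regex matching.)
Match findMatch(const std::vector<Rule> &rules, const std::string &query) {
  for (const Rule &r : rules)
    if (r.pattern == query)
      return Match{r.pattern, r.line};
  return Match{std::string(), 0};
}
```

A caller can then write `Match m = findMatch(rules, "foo"); if (m.matched()) ...` and read `m.lineNo` directly.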

Make the systemSupportsMemoryTagging() function return even on systems that don't support memory tagging. This avoids the need to always check whether memory tagging is supported before calling the function. Make systemSupportsMemoryTagging() cache the getauxval return value instead of calling getauxval every time. Updated the code that calls systemSupportsMemoryTagging().
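The caching idea can be sketched with a function-local static, which C++ initializes once on first use. This is only an illustration under stated assumptions: `queryHardwareCaps`, `kTaggingBit`, and `probeCalls` are hypothetical stand-ins for the real `getauxval` query and capability bit, and the actual patch may cache the value differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the OS query (getauxval in the real code).
// It counts its calls so the caching behavior can be observed.
static int probeCalls = 0;
static unsigned long queryHardwareCaps() {
  ++probeCalls;
  return 0x2UL; // assumption: pretend the tagging bit is set
}

// Hypothetical capability bit; the real value is platform-defined.
static const unsigned long kTaggingBit = 0x2UL;

// The query runs once, on the first call; later calls reuse the cached
// answer. On a system without the feature this simply returns false.
bool systemSupportsTagging() {
  static const bool supported = (queryHardwareCaps() & kTaggingBit) != 0;
  return supported;
}
```

Because the function always returns a plain `bool`, callers can invoke it unconditionally rather than guarding the call with a separate platform check.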

This hardens the code to check based on WideMember0's operands. This ensures each call will go through the same check. Should be NFC currently but needed when generalizing in follow-up patches.

…ctive as noop. (llvm#167359)

) Refactor computeWidth from CUFOpConversion into a shared helper function computeElementByteSize in CUFCommon.

…m#153608) This should be the peephole's job. Because ANDS sets the V flag to 0, signed comparisons with 0 can safely be replaced with tst. Note that this applies only to AArch64, because ANDS on ARM leaves the V flag unchanged. Fixes: llvm#154387

… of pattern matching (llvm#166806) Flang alias analysis used to find allocation sites by pattern matching allocation ops, mainly in the FIR dialect. This MR instead characterizes an allocation site by whether the result of an op has a MemAlloc effect.

After llvm#164165, we emit warnings from non-system headers by default. This change only preserves functionality of `clang-tidy` as it was before the change.

…m#158224) Both conceptually belong to the same subtarget, so it should not be necessary to pass in the context TargetRegisterInfo to any TargetInstrInfo member. Add this reference so those superfluous arguments can be removed. Most targets placed their TargetRegisterInfo as a member in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo, so unify all targets to look the same.

Will help in future patches.

The save stats test would fail downstream because the tests were run from a read-only directory. The cwd flag would attempt to place the statistics in that read-only directory, causing an error. This patch ensures that the tests are always run from a temp directory.

llvm#167350) fwrite and friends don't modify errno if no error occurred. Therefore fwrite_unlocked's return value shouldn't be constructed from errno without checking if an error actually occurred. This fixes an error introduced by llvm@9e2f73f
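The rule stated above (consult errno only after the call itself reports a failure) can be sketched from the caller's side. This is not the libc code touched by the patch: `WriteResult` and `checkedWrite` are hypothetical names, and the fallback to `EIO` when errno is unset is an assumption made for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>

// Hypothetical result type: items written plus an error code (0 = success).
struct WriteResult {
  std::size_t written;
  int error;
};

// Writes `count` items of `size` bytes. errno is read only when fwrite
// reports a short write, so a stale errno left by earlier code is never
// mistaken for a failure of this call.
WriteResult checkedWrite(const void *data, std::size_t size,
                         std::size_t count, std::FILE *stream) {
  if (size == 0 || count == 0)
    return {0, 0}; // nothing to write; fwrite returns 0 without an error
  std::size_t written = std::fwrite(data, size, count, stream);
  if (written < count)
    return {written, errno != 0 ? errno : EIO}; // assumption: EIO fallback
  return {written, 0};
}
```

With this shape, a successful write reports `error == 0` even if errno happened to be nonzero before the call.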

This is merging loads and stores so use the combined DebugLoc. Not sure if computeBase should be using the merged location from all the involved instructions. I'm also not sure how to test this sort of thing.

… names (llvm#167293) We shouldn't be calling `getUnversionedName` unconditionally because DWARFv6-enabled Clang will emit versioned language names (i.e., `sourceLanguageName`...see new test case). We need to check `hasVersionedName` and pick the appropriate API based on that. Otherwise we assert in `getUnversionedName`.

When transforming floating-point induction variables into integer ones, make sure we stay within the bounds of fp values that can be represented as integers without gaps, i.e., 2^24 and 2^53 for IEEE-754 single and double precision respectively (both on negative and positive side). Fixes: llvm#166496.
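The bounds quoted above follow from the significand width: a binary floating-point type with p significand bits represents every integer of magnitude up to 2^p exactly, and gaps start right after that. A minimal sketch that computes the bound from `std::numeric_limits` follows; the helper names are hypothetical and this is not the check used in IndVarSimplify.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest N such that every integer in [-N, N] is exactly representable
// in T: 2^digits, where digits is the significand precision in bits
// (24 for IEEE-754 single precision, 53 for double precision).
template <typename T>
T maxContiguousInteger() {
  static_assert(std::numeric_limits<T>::is_iec559,
                "sketch assumes an IEEE-754 type");
  return std::ldexp(T(1), std::numeric_limits<T>::digits);
}

// Hypothetical helper: true if `value` lies inside the gap-free range,
// i.e. an integer counter could stand in for it without losing steps.
template <typename T>
bool inContiguousIntegerRange(T value) {
  return std::fabs(value) <= maxContiguousInteger<T>();
}
```

For example, 2^24 + 1 = 16777217 has no single-precision representation, so a `float` induction variable stepping by 1 stops advancing once it reaches 2^24, while an integer counter would continue.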

…6395) So we know before _what_ entry in the chain we need to look for the InitList. Fixes llvm#166171

This means that VPExpressions will now be constructed for VPPartialReductionRecipe's when the loop has tail-folding predication. Note that control-flow (if/else) predication is not yet handled for partial reductions, because of the way partial reductions are recognised and built up.

…nux) (llvm#165451) This patch reimplements the SME ABI `__arm_za_disable` routine within libunwind. This routine must be called before resuming from unwinding on AArch64 platforms with SME support. Before calling the routine, we need to check that SME is available. In this patch, this is implemented for Linux-based platforms by checking HWCAP2. It should be possible to implement this check for other platforms as required. This patch includes a test for this functionality. This test requires SME, so on platforms without it, it will simply pass.

llvm#161635) `aligned_storage` has been deprecated and will most likely be removed in a future version of C++. This patch removes some of its uses to avoid having to work around its removal in the future.

…lvm#164377)

llvm#167255) Patch updates transform.memref.erase_dead_alloc_and_stores to not delete escaped allocations.

The original problem does not reproduce anymore, but add the test case. Fixes llvm#163563

This allows us to strip a special case in VPWidenGEP::execute.

This patch adds cost model tests for `bfloat` operations with `+sve-b16b16`. Currently, some of these costs are higher than they should be as the cost model is assuming `bfloat`s need promotion, but some of these operations are natively supported with `+sve-b16b16`.

After llvm#167359 / 95db31e. A fix was attempted in llvm#167423 but was not quite enough. From what I could understand, in v1 format you have to specify all the basic blocks, whereas before, !call_me implied they were all cold (I think, very shaky understanding here). For this test we want to see blocks like call_me/foo/call_me, so adding a line for block 1 fixes the tests. It could produce more blocks at some point, but I think as long as foo is within two of them, it'll be fine.

… into new clangOptions library" (llvm#167374)" This reverts commit f63d33d.

) Currently the only way to enable the use of wide active lane masks is to pass -enable-wide-lane-mask and force both interleaving & tail-folding with additional flags. This patch changes selectInterleaveCount to consider interleaving if wide lane masks were requested, although the feature remains off by default.

…se the load is hidden behind a bitcast (llvm#167491)

) This patch proposes replacing the inline assembly with an intrinsic and adds the `noreturn` to avoid warnings when the caller is also `noreturn`.

Adds `transform.xegpu.set_gpu_launch_threads` that overrides `gpu.launch` operation threads.

z1-cciauto · 2025-11-11T13:46:34Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2768

naveen-seth and others added 30 commits November 10, 2025 21:24

X86: Enable terminal rule (llvm#165957)

793ab6a

[LLDB] Fix darwin shell tests under ASAN

2fc2e1f

ARM: Enable terminal rule (llvm#165958)

f8e9723

AArch64: Enable terminal rule (llvm#165959)

2aa629d

[AMDGPU] remove clamp and omod for trans bf16 insts (llvm#165819)

067f155

trans bf16 insts do not support clamp and omod

[Hexagon] Clean-up Instrprof test (llvm#166990)

826cadd

[bazel][clang] Port llvm#167374: split clang options/driver (llvm#167387

17e2641

)

[CIR][NFC] Re-land: Add test for Complex imag literal GNU extension (l…

b4a6151

…lvm#167383) Re-land: Add test for Complex imag literal GNU extension after updating the name

[OpenMP][Clang] Add codegen support for dyn_groupprivate clause (llvm…

540250c

…#152830) This adds the codegen support for the dyn_groupprivate clause.

[LLDB] Fix (more) darwin shell tests under ASAN

20e1a12

[AMDGPU] Remove calling conv check on entry function (llvm#162080)

fb2fa21

It is undefined behavior to call a function with a mismatched calling convention. Rather than crash on this behavior, it should compile. This LLVM defect was identified via the AMD Fuzzing project.

AMDGPU/GlobalISel: Fix AGPR regbank check for mfma_scale (llvm#167393)

8c86bc8

Fixes regressions with llvm#159493 after 476a6ea

[VPlan] Use getDefiningRecipe instead of directly accessing Def. (NFC)

0767c64

Use getDefiningRecipe to future-proof the code. Split off from llvm#156262 as suggested.

[VPlan] Update canNarrowLoad to check WidenMember0's op first (NFCI).

8b1cc2d

This hardens the code to check based on WideMember0's operands. This ensures each call will go through the same check. Should be NFC currently but needed when generalizing in follow-up patches.

Treat specifying a function in the bbsection profile without any dire…

95db31e

…ctive as noop. (llvm#167359)

[flang][CUDA] Unify element size computation in CUF helpers (llvm#167398

d5125b3

) Refactor computeWidth from CUFOpConversion into a shared helper function computeElementByteSize in CUFCommon.

Add default empty header filter regex to root .clang-tidy (llvm#167386)

ad9eb0d

After llvm#164165, we emit warnings from non-system headers by default. This change only preserves functionality of `clang-tidy` as it was before the change.

[NFC][SpecialCaseList] Refactor error handling (llvm#167277)

da996a3

Will help in future patches.

AMDGPU: Use getMergedLocation in SILoadStoreOptimizer (llvm#156396)

e5e74e9

This is merging loads and stores so use the combined DebugLoc. Not sure if computeBase should be using the merged location from all the involved instructions. I'm also not sure how to test this sort of thing.

Michael137 and others added 22 commits November 11, 2025 09:29

[IndVarSimplify] Precommit tests for PR166649 (NFC)

9100001

[clang][bytecode] Mark CXXDefaultInitExprs in InitLink chain (llvm#16…

8e3188a

…6395) So we know before _what_ entry in the chain we need to look for the InitList. Fixes llvm#166171

[libc++] Remove some of the uses of aligned_storage inside the library (

c0bca9c

llvm#161635) `aligned_storage` has been deprecated and will most likely be removed in a future version of C++. This patch removes some of its uses to avoid having to work around its removal in the future.
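A common replacement for `std::aligned_storage` is a suitably aligned byte array combined with placement new. The sketch below shows that general pattern only; `Slot` is a hypothetical class for illustration and is not taken from the libc++ change, which may use a different approach at each call site.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical single-object slot with manually managed lifetime.
template <typename T>
class Slot {
  // Raw storage with the size and alignment of T; this takes the place
  // of std::aligned_storage<sizeof(T), alignof(T)>::type.
  alignas(T) unsigned char storage_[sizeof(T)];
  T *ptr_ = nullptr; // points into storage_ while an object is alive

public:
  Slot() = default;
  Slot(const Slot &) = delete;
  Slot &operator=(const Slot &) = delete;
  ~Slot() { reset(); }

  // Destroys any current object, then constructs a new T in place.
  template <typename... Args>
  T &emplace(Args &&...args) {
    reset();
    ptr_ = ::new (static_cast<void *>(storage_)) T(std::forward<Args>(args)...);
    return *ptr_;
  }

  // Runs the destructor of the held object, if there is one.
  void reset() {
    if (ptr_) {
      ptr_->~T();
      ptr_ = nullptr;
    }
  }

  bool engaged() const { return ptr_ != nullptr; }
  T &get() { return *ptr_; } // precondition: engaged()
};
```

Keeping the pointer returned by placement new avoids having to reinterpret the byte array on every access.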

[libc++][NFC] Make the exception implementation files self-contained (l…

dacd2f9

…lvm#164377)

[mlir]: Add handling of escaped memrefs to erase_dead_alloc_and_stores (

0400b9a

llvm#167255) Patch updates transform.memref.erase_dead_alloc_and_stores to not delete escaped allocations.

[clang][bytecode] Add a C test case (llvm#167484)

c3c4a88

The original problem does not reproduce anymore, but add the test case. Fixes llvm#163563

[VPlan] Handle WidenGEP in narrowToSingleScalars (llvm#166740)

fdd52f5

This allows us to strip a special case in VPWidenGEP::execute.

merge main into amd-staging

b1b1257

Revert "Reland "[clang] Refactor option-related code from clangDriver…

3a73595

… into new clangOptions library" (llvm#167374)" This reverts commit f63d33d.

[flang-rt] Use dlsym to access char** environ on FreeBSD (llvm#158477)

7a73e69

[X86] bittest-big-integer.ll - add test showing missed RMW fold becau…

6e5f277

…se the load is hidden behind a bitcast (llvm#167491)

merge main into amd-staging

fa81402

[Clang][CUDA] Replace inline asm in __trap() with intrinsic (llvm#166152

b440fb7

) This patch proposes replacing the inline assembly with an intrinsic and adds the `noreturn` to avoid warnings when the caller is also `noreturn`.

[MLIR][XeGPU][TransformOps] Add set_gpu_launch_threads op (llvm#166865)

300750d

Adds `transform.xegpu.set_gpu_launch_threads` that overrides `gpu.launch` operation threads.

merge main into amd-staging

bcb8544

ronlieb requested review from a team and dpalermo November 11, 2025 13:44

ronlieb requested review from nicolasvasilache and stellaraccident as code owners November 11, 2025 13:44

dpalermo approved these changes Nov 11, 2025


z1-cciauto merged commit c2e90ff into amd-staging Nov 11, 2025
18 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251111061038 branch November 11, 2025 16:26


62 participants
