merge main into amd-staging #503

ronlieb · 2025-11-05T21:50:49Z

No description provided.

Update tests to contain auto generated checks.

PR llvm#165993 accidentally broke the lowering of the `test.wait` Op. This patch fixes the issue and adds tests to verify the lowering to intrinsics for all mbarrier Ops, ensuring similar regressions are caught in the future. Additionally, the `cp-async-mbarrier` test is moved to the `mbarriers.mlir` test file to keep all related tests together. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>

llvm#166536) …e size is unknown Keep _negative suffix only for test cases when the size is negative

… checks (llvm#148810) This PR adds support for the NOTIFY specifier in the image selector as described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.

…ional (llvm#166032) This picks up from llvm#166028, making the `Function` argument optional: most cases don't need to provide it, but in e.g. InstCombine's case, where the instruction (select, branch) is not attached to a function yet, the function needs to be passed explicitly. Co-authored-by: Florian Hahn <flo@fhahn.com>

…lvm#166078) In the following example, `Functor::method()` inappropriately triggers a diagnostic that `outer()` is blocking by allocating memory.

```
void outer() [[clang::nonblocking]]
{
  struct Functor {
    int* ptr;
    void method() { ptr = new int; }
  };
}
```

Co-authored-by: Doug Wyatt <dwyatt@apple.com>


…m#155630) When there's a deep inheritance hierarchy of multiple C++ classes (see below), the mangled name of a VFTable can include multiple key nodes in the target name. For example, in the following code, MSVC will generate mangled names for the VFTables that have up to three key classes in the context.

<details><summary>Code</summary>

```cpp
class Base1 {
  virtual void a() {};
};
class Base2 {
  virtual void b() {}
};

class Ind1 : public Base1 {};
class Ind2 : public Base1 {};
class A : public Ind1, public Ind2 {};

class Ind3 : public A {};
class Ind4 : public A {};
class B : public Ind3, public Ind4 {};

class Ind5 : public B {};
class Ind6 : public B {};
class C : public Ind5, public Ind6 {};

int main() { auto i = new C; }
```

</details>

This will include `??_7C@@6BInd1@@Ind4@@Ind5@@@` (and every other combination). Microsoft's undname will demangle this to "const C::\`vftable'{for \`Ind1's \`Ind4's \`Ind5'}". Previously, LLVM would demangle this to "const C::\`vftable'{for \`Ind1'}". With this PR, the output of LLVM's undname will be identical to Microsoft's version.

This changes `SpecialTableSymbolNode::TargetName` to a node array which contains each key from the name. Unlike namespaces, these keys are not in reverse order; they are in the same order as in the mangled name.

This patch fixes:

  polly/lib/Transform/ScheduleOptimizer.cpp:935:17: error: unused variable 'File' [-Werror,-Wunused-variable]
  polly/lib/Transform/ScheduleOptimizer.cpp:936:9: error: unused variable 'Line' [-Werror,-Wunused-variable]
  polly/lib/Transform/ScheduleOptimizer.cpp:937:17: error: unused variable 'Msg' [-Werror,-Wunused-variable]

Reviewers: Pull Request: llvm#166587

…166505) This patch aligns llvm::to_address with C++20 std::to_address by adding a static_assert to prevent instantiation with function types. The C++20 standard says that std::to_address is ill-formed on a function type.
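A minimal sketch of the kind of guard described above, covering only the raw-pointer case. The name `to_address_sketch` is hypothetical and this is not the LLVM source; it only illustrates how a `static_assert` can reject function types at compile time.

```cpp
#include <type_traits>

// Hypothetical stand-in for a to_address-style helper (raw pointers only).
template <class T> constexpr T *to_address_sketch(T *P) {
  // Mirrors the C++20 rule that std::to_address is ill-formed for
  // function types: instantiating with a function type fails to compile.
  static_assert(!std::is_function_v<T>,
                "to_address cannot be used with function types");
  return P;
}
```

With this guard, passing a pointer to a function is rejected at compile time, while object pointers pass through unchanged.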

These two functions are declared in Hexagon.h. Identified with readability-redundant-declaration.

In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.
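As an illustration of the C++17 rule mentioned above, here is a small sketch; `TableLimits` and `maxEntriesRef` are hypothetical names, not taken from ObjectYAML.

```cpp
// Requires C++17 or later.
struct TableLimits { // hypothetical type, for illustration only
  // Implicitly inline since C++17, so this in-class declaration is also
  // the definition.
  static constexpr int MaxEntries = 8;
};

// Binding a reference odr-uses the member. Before C++17 this needed an
// out-of-line `constexpr int TableLimits::MaxEntries;` in one source file;
// in C++17 that extra declaration is redundant.
inline const int &maxEntriesRef() { return TableLimits::MaxEntries; }
```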

…_fadd_v2f16 (llvm#166547) We didn't remove the `t` for this builtin in the past due to not being sure if we should use `float16/half`. This patch doesn't fix the _Float16/half question, I'll address that in a separate patch later (after discussing the options on our weekly meeting). At the moment we maintain the `h` for this builtin (which is likely not what we want for HIP).

…llvm#166373) Since 022e782 (2017) this attribute has an effect on both aarch64 and x86_64; update the docs to reflect this.

) GISel no longer falls back onto SDAG when attempting to lower the pmull and pmull64 intrinsics.

…66350) This patch refactors the logic to define each component of the `math_errhandling` macro. It assumes that math error handling is supported by the target and the C library unless otherwise disabled in the preprocessor logic. In addition to the refactoring, the support for error handling via exceptions is explicitly disabled for Arm targets with no FPU, that is, where `__ARM_FP` is not defined. This is because LLVM libc does not provide a floating-point environment for Arm no-FP configurations (or at least one with support for FP exceptions).
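A rough sketch of how such per-component preprocessor logic might look. The `SKETCH_*` macro names and the exact target check are assumptions made for illustration; the actual LLVM libc macros and conditions may differ.

```cpp
#include <cmath> // provides MATH_ERRNO and MATH_ERREXCEPT

// Assumption: error reporting via errno is supported unless disabled.
#define SKETCH_ERRNO MATH_ERRNO

// Assumption: on 32-bit Arm without an FPU (__ARM_FP undefined) there is
// no floating-point environment, so exception-based reporting is disabled.
#if defined(__arm__) && !defined(__ARM_FP)
#define SKETCH_ERREXCEPT 0
#else
#define SKETCH_ERREXCEPT MATH_ERREXCEPT
#endif

// The combined value is the bitwise OR of the enabled components.
#define SKETCH_MATH_ERRHANDLING (SKETCH_ERRNO | SKETCH_ERREXCEPT)
```

Defining each component separately keeps every "disable" condition local to the component it affects.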

…m#166527) This avoids unintentional comparisons between `SourceLanguage` and `LanguageType`. Also marks `operator bool` explicit so we don't implicitly convert to bool.
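The effect of the `explicit` markers can be sketched with stand-in types. `LanguageTypeSketch` and `SourceLanguageSketch` are hypothetical and much simpler than lldb's real classes.

```cpp
#include <type_traits>

enum LanguageTypeSketch { eLangC = 1, eLangCxx = 2 }; // hypothetical enum

struct SourceLanguageSketch { // hypothetical, not lldb's class
  unsigned name = 0;
  SourceLanguageSketch() = default;
  // explicit: an enum value no longer converts silently in comparisons.
  explicit SourceLanguageSketch(LanguageTypeSketch t) : name(t) {}
  // explicit: usable in `if (lang)` but not as an implicit bool/int.
  explicit operator bool() const { return name != 0; }
  friend bool operator==(const SourceLanguageSketch &a,
                         const SourceLanguageSketch &b) {
    return a.name == b.name;
  }
};

static_assert(
    !std::is_convertible_v<LanguageTypeSketch, SourceLanguageSketch>,
    "no implicit conversion from the enum");
static_assert(!std::is_convertible_v<SourceLanguageSketch, bool>,
              "no implicit conversion to bool");
```

In this sketch, `lang == eLangCxx` no longer compiles; the caller has to write `lang == SourceLanguageSketch(eLangCxx)`, which makes the conversion visible.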

…lvm#164987) Introduce a new class for the TargetLowering usage. This tracks the subtarget-specific lowering decisions for which libcall to use. RuntimeLibcallsInfo is a module-level property, which may have multiple implementations of a particular libcall available. This attempts to be a minimum-boilerplate patch to introduce the new concept. In the future we should have a tablegen way of selecting which implementations should be used for a subtarget. Currently we do have some conflicting implementations added; it just happens to work out that the default case to prefer is alphabetically first (plus some of these are still using manual overrides in TargetLowering constructors).

…mpl (llvm#166585) This is unused and will not make sense.

Implement `xvrlw`.

…db to re-use it (llvm#165323) lldb's CPlusPlusNameParser currently identifies keywords using its own map, implemented using clang/Basic/TokenKinds.def. However, it does not respect the language options, so identifiers can be incorrectly determined to be keywords in languages where they are not keywords. Rather than implementing the logic to enable/disable keywords in both projects, it makes sense for lldb to use clang's implementation. See llvm#164284 for more information.

The VM_MEMORY_SANITIZER constant was added in macOS 10.15 and friends. Support using the constant on older OSes. Fixes llvm#156144

…e. (llvm#166225) Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and opens the way to creating wchar_t overloads with the same names (e.g. for isalpha), which would simplify templated code for conversion functions (see 315dfe5). Add the int->char conversion to the public entrypoint implementations instead. We also need to introduce a bounds check on the input argument values: these functions' behavior is unspecified if the argument is neither EOF nor fits in the "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.
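The split between a char-based internal helper and an int-based public entrypoint could look roughly like this. The names `internal_sketch::isalpha` and `isalpha_entry` are hypothetical, and the ASCII-only check is an assumption made to keep the sketch short.

```cpp
namespace internal_sketch {
// Internal helper works on char directly, so internal callers need no casts.
constexpr bool isalpha(char ch) {
  return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}
} // namespace internal_sketch

// Hypothetical public entrypoint with the int parameter the C API uses.
inline int isalpha_entry(int c) {
  // Explicit bounds check: negative values (including EOF) and values
  // above the unsigned char range return 0 instead of reaching the cast.
  if (c < 0 || c > 255)
    return 0;
  return internal_sketch::isalpha(static_cast<char>(c)) ? 1 : 0;
}
```

Keeping the range check in the entrypoint means the internal helper never sees a value that does not fit in a character.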

…6234) This is a counterpart of llvm#166225 but for wctype_utils (which are not yet widely used). For now, I'm just changing the types from wint_t to wchar_t to match the regular ctype_utils change. The next change may rename most of the functions to match the name of ctype_utils variants, so that we could be calling them from the templated code operating on "const char*" and "const wchar_t*" strings, and the right function signature would be picked up.

…6477)

Use `IfDefEmitter` and `NamespaceEmitter` in SDNodeInfoEmitter.

Users may have multiple devices and would like to resolve the home path based on the machine they are on. This patch expands the tilde `~` character at the front of the given file path.
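A minimal sketch of leading-tilde expansion, assuming the home directory is passed in explicitly. `expandTildeSketch` is a hypothetical helper, not the function lldb-dap uses, and it deliberately leaves `~user` forms untouched.

```cpp
#include <string>

// Hypothetical helper: replaces a leading "~" with the given home directory.
// Only "~" alone or "~/..." is expanded; anything else is returned as-is.
inline std::string expandTildeSketch(const std::string &path,
                                     const std::string &home) {
  if (path.empty() || path[0] != '~')
    return path; // no leading tilde
  if (path.size() > 1 && path[1] != '/')
    return path; // "~user/..." is outside this sketch
  return home + path.substr(1);
}
```

Because the expansion happens on the machine that launches the executable, the same configured path can resolve to a different home directory on each device.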

This patch defines errno unit and integration test asserts as no-ops on GPU targets. Checking errno in tests has caused build breakages in previous patches.

…166471) If the input to LowerBufferFatPointers is such that the resource- and offset-specific `select` instructions generated for a `select` on `ptr addrspace(7)` fold away, the pass would crash when trying to replace an instruction with itself. This commit resolves the issue. Fixes iree-org/iree#22551

…ers (llvm#166593)

…llvm#154237) targets This patch enables support of the NV (non-volatile) bit in FLAT instructions in GFX9 (pre-GFX90A) targets.

The brace wrapping for Java records should now behave similarly to classes. Before, opening braces for Java records were always placed on the same line as the record definition.

…lvm#166583) This information will be needed in more emitters, so start factoring it to be more reusable.

…6391) `R_AARCH64_ADD_ABS_LO12_NC` is for the `ADD` instruction in the `ADRP+ADD` sequence. For `ADRP+LDR` sequence generated in LDR relaxation, relocation type for `LDR` should be `R_AARCH64_LDST64_ABS_LO12_NC` if it is 64-bit integer load or `R_AARCH64_LDST32_ABS_LO12_NC` if 32-bit. Sorry should have included this in llvm#165787.
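The selection rule described above can be sketched as a small lookup. `lo12RelocForLoad` is a hypothetical helper that returns relocation names as strings; it is not the linker's actual code and only covers the two widths the message mentions.

```cpp
#include <string>

// Hypothetical helper: picks the low-12-bit relocation for the LDR in an
// ADRP+LDR sequence based on the width of the integer load.
inline const char *lo12RelocForLoad(unsigned loadBits) {
  switch (loadBits) {
  case 64:
    return "R_AARCH64_LDST64_ABS_LO12_NC";
  case 32:
    return "R_AARCH64_LDST32_ABS_LO12_NC";
  default:
    return nullptr; // other widths are outside this sketch
  }
}
```

`R_AARCH64_ADD_ABS_LO12_NC` does not appear here because it belongs to the `ADD` in an `ADRP+ADD` sequence, not to a load.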

llvm#165956) Continued from llvm#165804. This maintains the BFI of the default branch. Originally `10/63`, post-pass, it ends up being `5/63 + 58/63 * 5/58` (the first term is from `PROF`; the second is the probability of going to the switch lookup times the probability, there, of taking the default branch). Issue llvm#147390
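Working through the arithmetic with the numbers quoted above shows that the two terms add back up to the original frequency:

```latex
\frac{5}{63} + \frac{58}{63}\cdot\frac{5}{58}
  = \frac{5}{63} + \frac{5}{63}
  = \frac{10}{63}
```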

Interleaved stores with factor 8 and factor 16 are cheaper on AArch64 with additional interleave instructions.

…oned/unversioned selector (llvm#164507) We don't have sufficient information to know when the versioned (or unversioned) loop variant will be taken, so we mark the branch as having "unknown" probabilities. Issue llvm#147390

Enumeration of relocation types is not always sequential, e.g. on AArch64 the first real relocation type is 0x101. As such, the existing code in `Relocation::print()` was crashing while printing AArch64 relocations. Fix it by using `llvm::object::getELFRelocationTypeName()`.
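To illustrate why indexing a dense name table by relocation type is unsafe when the values are sparse, here is a sketch of a value-based lookup. `relocNameSketch` is a hypothetical function covering only three AArch64 types; the actual fix calls `llvm::object::getELFRelocationTypeName()` instead.

```cpp
#include <cstdint>
#include <string>

// Hypothetical lookup keyed on the relocation type value. A switch handles
// the gap between 0x0 and 0x101 without any out-of-bounds table access.
inline const char *relocNameSketch(uint32_t type) {
  switch (type) {
  case 0x0:
    return "R_AARCH64_NONE";
  case 0x101:
    return "R_AARCH64_ABS64";
  case 0x102:
    return "R_AARCH64_ABS32";
  default:
    return "Unknown";
  }
}
```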

Add additional tests for narrowing interleave groups with casts.

z1-cciauto · 2025-11-05T21:51:35Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2672

lei137 and others added 30 commits November 5, 2025 10:24

[PowerPC][NFC] auto gen checks vec rounding tests (llvm#166435)

3a84aef

Update tests to contain auto generated checks.

[BOLT][NFC] Rename funtions with _negative suffix to _unknown when th… (

338fb02

llvm#166536) …e size is unknown Keep _negative suffix only for test cases when the size is negative

[flang] Adding NOTIFY specifier in image selector and add notify type…

87fb7b0

… checks (llvm#148810) This PR adds support for the NOTIFY specifier in the image selector as described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.

[gn] port bb4ed55

4334b43

[NFC][TableGen] Adopt NamespaceEmitter in DirectiveEmitter (llvm#165600)

d568601

Fix failures introduced in llvm#166032 (llvm#166574)

ff108f7

[gn build] Port 3700587

a796d18

[gn build] Port 3ebed51

ef6947b

[gn build] Port 51d0f6d

9bb67f8

[gn build] Port 718a3b2

3bf0ce1

[gn build] Port dd14eb8

ce5dac6

[CI][NFC] Reformat Python Files in .ci directory

0b5a00a

Reviewers: Pull Request: llvm#166587

[Support] Simplify minIntN and isUIntN (NFC) (llvm#166506)

ab02808

[Hexagon] Remove redundant declarations (NFC) (llvm#166507)

0b29c3c

These two functions are declared in Hexagon.h. Identified with readability-redundant-declaration.

[ObjectYAML] Remove redundant declarations (NFC) (llvm#166508)

aea75d0

In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.

[llvm] Proofread GoldPlugin.rst (llvm#166509)

d7c1df3

[AMDGPU] Autogenerate R600 packetizer checks (llvm#166570)

3154a84

[clang] [doc] Document that the ms_abi attribute works on aarch64 too (…

d4e3a23

…llvm#166373) Since 022e782 (2017) this attribute has an effect on both aarch64 and x86_64; update the docs to reflect this.

[AArch64][GlobalISel] Added pmull/pmull64 intrinsic support (llvm#165740

95c8750

) GISel no longer falls back onto SDAG when attempting to lower the pmull and pmull64 intrinsics.

[lldb] Mark single-argument SourceLanguage constructors explicit (llv…

9b1719e

…m#166527) This avoids unintentional comparisons between `SourceLanguage` and `LanguageType`. Also marks `operator bool` explicit so we don't implicitly convert to bool.

RuntimeLibcalls: Remove LibcallLoweringPredicate from RuntimeLibcallI…

dd88923

…mpl (llvm#166585) This is unused and will not make sense.

lei137 and others added 23 commits November 5, 2025 13:22

[PowerPC] Implement vsx rotate left word instr (llvm#160754)

ebeb36b

Implement `xvrlw`.

[debugserver] Fix debugserver build on < macOS 10.15 (llvm#166599)

bc55f4f

The VM_MEMORY_SANITIZER constant was added in macOS 10.15 and friends. Support using the constant on older OSes. Fixes llvm#156144

[gn build] Port 056d2c1

c193eea

[NFC][LLVM][IR] Cleanup namespace usage in LLVM IR cpp files (llvm#16…

37fff6e

…6477)

[NFC][TableGen] Adopt CodeGenHelpers in SDNodeInfoEmitter (llvm#165622)

00171b3

Use `IfDefEmitter` and `NamespaceEmitter` in SDNodeInfoEmitter.

[lldb-dap] expand tilde in dap executable path (llvm#162635)

28a279c

Users may have multiple devices and would like to resolve the home path based on the machine they are on. This patch expands the tilde `~` character at the front of the given file path.

[libc] Make errno asserts noop on gpu targets (llvm#166606)

3d0a367

This patch defines errno unit and integration test asserts as no-ops on GPU targets. Checking errno in tests has caused build breakages in previous patches.

[clang][SourceManager] Reuse code when computing Column and Line numb…

1041423

…ers (llvm#166593)

[AMDGPU][MC] GFX9 - Support NV bit in FLAT instructions in pre-GFX90A (…

db6231b

…llvm#154237) targets This patch enables support of the NV (non-volatile) bit in FLAT instructions in GFX9 (pre-GFX90A) targets.

[clang-format] Fix brace wrapping for Java records (llvm#164711)

6c4f968

The brace wrapping for Java records should now behave similarly to classes. Before, opening braces for Java records were always placed on the same line as the record definition.

TableGen: Split RuntimeLibcallsEmitter into separate utility header (l…

0469ff0

…lvm#166583) This information will be needed in more emitters, so start factoring it to be more reusable.

[InterleavedAccess] Construct interleaved access store with shuffles

78d6491

Interleaved stores with factor 8 and factor 16 are cheaper on AArch64 with additional interleave instructions.

[gn build] Port 0469ff0

163933e

[LV] Add tests for narrowing interleave groups with casts.

5419097

Add additional tests for narrowing interleave groups with casts.

merge main into amd-staging

f80cec2

ronlieb requested review from a team, dpalermo and jun-amd November 5, 2025 21:50

dpalermo approved these changes Nov 5, 2025


z1-cciauto merged commit 7d640c0 into amd-staging Nov 6, 2025
14 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251105150714 branch November 6, 2025 00:30

41 participants
