[AutoBump] Merge with eff6b642 (Jan 17) (50) #521

jorickert · 2025-03-20T08:03:49Z

No description provided.

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/784266

…vm#122726) …ecord level. This fixes the incorrect diagnostic emitted when compiling the following snippet:

```
// string_view.h
template<class _CharT>
class basic_string_view;

typedef basic_string_view<char> string_view;

template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
    basic_string_view()
    {
    }
};

inline basic_string_view<char> foo()
{
  return basic_string_view<char>();
}

// A.cppm
module;
#include "string_view.h"
export module A;

// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;
```

The diagnostic is:

```
string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier
```

The underlying issue is that deserialization of the `preferred_name` attribute triggers deserialization of `basic_string_view<char>`, which triggers the deserialization of the `preferred_name` attribute again (since it's attached to the `basic_string_view` template). The deserialization logic is implemented in a way that prevents it from going on a loop in a literal sense (it detects early on that it has already seen the `string_view` typedef when trying to start its deserialization for the second time), but leaves the typedef deserialization in an unfinished state. Subsequently, the `string_view` typedef from the deserialized module cannot be merged with the same typedef from `string_view.h`, resulting in the above diagnostic.

This PR resolves the problem by delaying the deserialization of the `preferred_name` attribute until the deserialization of the `basic_string_view` template is completed. As a result of deferring, the deserialization of the `preferred_name` attribute doesn't need to go on a loop since the type of the `string_view` typedef is already known when it's deserialized.

When libomp is built with -cf-protection, add endbr instructions to the start of functions for Intel CET support.

This fixes a number of issues introduced in llvm#97130 when LLVM_LIBDIR_SUFFIX is a non-empty string. Make sure that the libdir is always referenced as `lib${LLVM_LIBDIR_SUFFIX}`, not as just `lib` or `${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}`. This is the standard libdir convention for all LLVM subprojects. Using `${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}` would result in a duplicate suffix.

…d to targetShrinkDemandedConstant is not 32 or 64 (llvm#123084) See llvm#123029 for details.

…llvm#122458) Currently, llvm.splice may exhibit unexpected behavior if the EVL of the second-to-last iteration is not VF*UF. Issue llvm#122461

Alive2: https://alive2.llvm.org/ce/z/Kgamks Closes llvm#123175. For `@foo1`, the nsw flag is propagated because we first convert it into `mul nsw nuw (shl nsw nuw X, 1), 3`.

This fixes the following tests when clang is compiled with `-DENABLE_LINKER_BUILD_ID=ON`:
- BOLT :: AArch64/check-init-not-moved.s
- BOLT :: X86/dwarf5-dwarf4-types-backward-forward-cross-reference.test
- BOLT :: X86/dwarf5-locexpr-referrence.test

This merges the maintainer lists for the ARM and AArch64 backends, as many people work on both to some degree. The list includes focus areas where possible.

When building with -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON with a recent libstdc++ (e.g. from gcc 13.3.0), the test case clang/test/Misc/warning-flags-tree.c fails with the message:

```
+ diagtool tree --internal
.../include/c++/13.3.0/bits/stl_algo.h:2013:
In function:
    _ForwardIterator std::lower_bound(_ForwardIterator, _ForwardIterator, const _Tp &, _Compare) [_ForwardIterator = const diagtool::DiagnosticRecord *, _Tp = diagtool::DiagnosticRecord, _Compare = bool (*)(const diagtool::DiagnosticRecord &, const diagtool::DiagnosticRecord &)]

Error: elements in iterator range [first, last) are not partitioned by the predicate __comp and value __val.

Objects involved in the operation:
    iterator "first" @ 0x7ffea8ef2fd8 {
    }
    iterator "last" @ 0x7ffea8ef2fd0 {
    }
```

The reason for this error is that std::lower_bound is called on BuiltinDiagnosticsByID without it being entirely sorted. If the range is not sorted, the behavior of std::lower_bound is undefined. This is detected when building with expensive checks.

To make BuiltinDiagnosticsByID sorted, we need to slightly change the order in which the inc-files are included. DiagnosticCrossTUKinds.inc is included too early in DiagnosticNames.cpp and should be moved down to directly after DiagnosticCommentKinds.inc.

As part of this pull request, the includes that build up the BuiltinDiagnosticsByID table are extracted into a common wrapper header file, AllDiagnosticKinds.inc, that is used by both clang and diagtool.
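To illustrate the precondition discussed above, here is a minimal sketch of a lookup table searched with `std::lower_bound`. The `Record` type, `byId`, `findById`, and `isSortedById` are hypothetical stand-ins for illustration only; they are not the actual diagtool or clang code. The point is that the lookup is only well-defined once the table is sorted by the same predicate used for the search.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a diagnostic record keyed by a numeric ID.
struct Record {
  unsigned id;
  const char *name;
};

// Ordering predicate shared by the sort check and the binary search.
static bool byId(const Record &a, const Record &b) { return a.id < b.id; }

// Reports whether the table satisfies the precondition of findById.
bool isSortedById(const std::vector<Record> &table) {
  return std::is_sorted(table.begin(), table.end(), byId);
}

// Binary search for a record. Precondition: `table` is sorted by `byId`;
// otherwise the behavior of std::lower_bound is undefined.
const Record *findById(const std::vector<Record> &table, unsigned id) {
  Record key{id, ""};
  auto it = std::lower_bound(table.begin(), table.end(), key, byId);
  if (it == table.end() || it->id != id)
    return nullptr;
  return &*it;
}
```

A table assembled from several pieces in the wrong order fails `isSortedById`, which is the same kind of violation the libstdc++ debug check reports; fixing the assembly order (or sorting) restores the precondition.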

Emit `R_LARCH_RELAX` relocations when expanding some macros, including:
- `la.tls.ie`, `la.tls.ld`, `la.tls.gd`, `la.tls.desc`,
- `call36`, `tail36`.

Other macros that need to emit `R_LARCH_RELAX` relocations were implemented in llvm#72961, including:
- `la.local`, `la.pcrel`, `la.pcrel` expanded as `la.abs`, `la`, `la.global`, `la/la.global` expanded as `la.pcrel`, `la.got`.

Note: the `la.tls.le` macro can be relaxed when expanded with `R_LARCH_TLS_LE_{HI20/ADD/LO12}_R` relocations. But if we do so, previously handwritten assembly code will produce errors due to the redundant `add.{w/d}` followed by `la.tls.le`. So `la.tls.le` continues to expand with `R_LARCH_TLS_LE_{HI20/LO12}`.

This commit adds relax relocations for the `tls_le` code sequence. Both handwritten assembly and code generated by clang are affected. A scheduled `tls_le` code sequence can be relaxed normally, so relax relocs can be added during code emission according to their relocs. Code sequences of other relaxable macros, such as `PCALA_{HI20/LO12}`, cannot simply add relax relocs according to their relocs, because we do not want to add relax relocs when the code model is large. This will be implemented in a later commit.

…ed nodebug (llvm#123253) Fixes one of the crashes uncovered by llvm#118710 `getOrCreateStandaloneType` asserts that a `DIType` was created for the requested type. If the `Decl` was marked `nodebug`, however, we can't generate debug-info for it, so we would previously trigger the assert. For now keep the assertion around and check the `nodebug` at the callsite.

Should the operands of `omp.atomic.read` differ, emit an implicit cast. In case of `struct` arguments, extract the 0-th index, emit an implicit cast if required, and store at the destination. Fixes llvm#112908

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/784266

) Add some functions in `AArch64MCPlusBuilder.cpp` to support inline for AArch64.

…/FDIV into FMUL"" (llvm#123313) Reverts llvm#123289

…vm#123086) This is an NFC right now, as currently, all register and spill sizes are the same, but the spill size is the correct size to use here.

…lvm#123080) This is already defined for each register class in AArch64RegisterInfo, not hardcoding it here makes these values easier to change (perhaps based on hardware mode).

This should fix the assert failure we were getting for the darwin OS.

This commit relands the changes from "[LV]: Teach LV to recursively (de)interleave. llvm#89018" Reason for revert: - The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: llvm#122643

… ARM64X (llvm#123194)

Needed for libstdc++ 15 compatibility.

…vm#87939) To deduce whether the optimization is legal we need to compare the target features between caller and callee versions. The criteria for bypassing the resolver are the following:

* If the callee's feature set is a subset of the caller's feature set, then the callee is a candidate for direct call.
* Among such candidates the one of highest priority is the best match and it shall be picked, unless there is a version of the callee with higher priority than the best match which cannot be picked from a higher priority caller (directly or through the resolver).
* For every higher priority callee version than the best match, there is a higher priority caller version whose feature set availability is implied by the callee's feature set.

Example: Callers and Callees are ordered in decreasing priority. The arrows indicate successful call redirections.

    Caller        Callee      Explanation
    =========================================================================
    mops+sve2 --+--> mops     all the callee versions are subsets of the
                |             caller but mops has the highest priority
                |
    mops      --+    sve2     between mops and default callees, mops wins

    sve              sve      between sve and default callees, sve wins
                              but sve2 does not have a high priority caller

    default -----> default    sve (callee) implies sve (caller),
                              sve2(callee) implies sve (caller),
                              mops(callee) implies mops(caller)
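The subset test and the "highest priority among candidates" step from the criteria above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the bitmask encoding, the numeric `priority` field, and the names `FeatureMask`, `Version`, `isSubset`, and `bestMatch` are all hypothetical, and the sketch assumes that `sve2` implies `sve`. It deliberately does not implement the additional conditions about higher priority callee versions, so it only finds the best-match candidate and does not decide whether the redirection is finally legal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical encoding: one bit per target feature.
using FeatureMask = std::uint32_t;
constexpr FeatureMask kSve = 1u << 0;
constexpr FeatureMask kSve2 = kSve | (1u << 1); // assumption: sve2 implies sve
constexpr FeatureMask kMops = 1u << 2;

// Hypothetical description of one function version.
struct Version {
  FeatureMask features; // 0 means the default version
  unsigned priority;    // larger value means higher priority (assumption)
};

// A callee is a candidate when it needs no feature the caller lacks.
bool isSubset(FeatureMask callee, FeatureMask caller) {
  return (callee & ~caller) == 0;
}

// Returns the index of the highest priority candidate callee, if any.
std::optional<std::size_t> bestMatch(FeatureMask caller,
                                     const std::vector<Version> &callees) {
  std::optional<std::size_t> best;
  for (std::size_t i = 0; i < callees.size(); ++i) {
    if (!isSubset(callees[i].features, caller))
      continue;
    if (!best || callees[i].priority > callees[*best].priority)
      best = i;
  }
  return best;
}
```

With callee versions mops, sve2, sve, and default in decreasing priority, a `mops+sve2` caller and a `mops` caller both select the mops callee as best match, and an `sve` caller selects the sve callee. Whether each of those calls may actually bypass the resolver still depends on the remaining criteria.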

Match the file style of using the ISD NodeType name for the combine/lower method name.

Since cf2e828 (SCEV: regen some tests with UTC) had the side-effect of moving an implied-via-addition test into IndVarSimplify, implication via addition is no longer covered in the SCEV tests. Fix this by writing fresh tests and checking backedge-taken output from SCEV.

Use a module directory in a test that uses another fortran test to avoid race conditions in module creation.

…123046) I think the only issue here was that we would erroneously consider functions which are "in the middle" of the function we're stepping to as a part of that function, and would try to step into them (likely stepping out of the function instead) rather than giving up early.

…#123319)

Preparing to add more config options and want to group them all from most-common to project / component specific.

The compiler does not support this feature on other architectures.

Missing half variants were also added. The builtins are now consistently emitted in vector form (i.e., with a splat of the literal to the appropriate vector size).

llvm#123097) …tion of global v…" (llvm#123067)" This reverts commit 44ba43a. Adds the flag to bbc as well.

Simplifies a future patch

Add missing REQUIRES: aarch64-registered-target

…itializa…" (llvm#123330) Reverts llvm#123097 Reverting due to buildbot failure https://lab.llvm.org/buildbot/#/builders/89/builds/14577.

…llvm#123196) If a pointer gets freed, it may not be dereferenceable any longer, even though there is a dominating dereferenceable assumption. As a first step, only consider assumptions if the pointer value cannot be freed if UseDerefAtPointSemantics is used. PR: llvm#123196

…ther TU (llvm#123059) Close llvm#61427. This is also helpful for partially implementing llvm#112294. The implementation strategy mimics llvm#122887. This patch splits the internal declarations from the general lookup table so that other TUs can't find the internal declarations.

…el (llvm#121678) Doing so can cause the resulting displacement after frame layout to become inexpressible (or cause over/underflow currently during frame layout). Fixes the error reported in llvm#101840 (comment).

… recursing (llvm#123261) `GetAttributes` returns all attributes on a given DIE, including any attributes that the DIE references via `DW_AT_abstract_origin` and `DW_AT_specification`. However, if an attribute exists on both the referring DIE and the referenced DIE, the first one encountered will be the one that takes precedence when querying the returned `DWARFAttributes`. But there was no guarantee in which order those attributes get visited. That means there's no convenient way of ensuring that an attribute of a definition doesn't get shadowed by one found on the declaration. One use-case where we don't want this to happen is for `DW_AT_object_pointer` (which can exist on both definitions and declarations, see llvm#123089). This patch makes sure we visit the current DIE's attributes before following DIE references. I tried keeping as much of the original `GetAttributes` unchanged and just added an outer `GetAttributes` that keeps track of the DIEs we need to visit next. There's precedent for this iteration order in `llvm::DWARFDie::findRecursively` and also `lldb_private::ElaboratingDIEIterator`. We could use the latter to implement `GetAttributes`, though it also follows `DW_AT_signature` so I decided to leave it for follow-up.
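The visiting order described above can be sketched with a small worklist. This is a simplified model, not LLDB's implementation: the `Die` struct, its string-valued attributes, and `collectAttributes` are hypothetical, and only the attribute names are taken from the commit message. Each DIE's own attributes are recorded before any referenced DIE is visited, and the first value seen for an attribute is the one that is kept.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified DIE: named attributes plus the DIEs it refers to
// (standing in for DW_AT_specification / DW_AT_abstract_origin targets).
struct Die {
  std::vector<std::pair<std::string, std::string>> attrs;
  std::vector<const Die *> refs;
};

// Collects attributes starting at `root`. The current DIE's attributes are
// recorded before referenced DIEs are visited, and the first value
// encountered for a name wins.
std::map<std::string, std::string> collectAttributes(const Die &root) {
  std::map<std::string, std::string> result;
  std::set<const Die *> seen;
  std::deque<const Die *> worklist{&root};
  while (!worklist.empty()) {
    const Die *die = worklist.front();
    worklist.pop_front();
    if (!seen.insert(die).second)
      continue; // already visited; guards against reference cycles
    for (const auto &attr : die->attrs)
      result.emplace(attr.first, attr.second); // keeps an existing entry
    for (const Die *ref : die->refs)
      worklist.push_back(ref); // visit referenced DIEs afterwards
  }
  return result;
}
```

In this model, a definition that carries its own `DW_AT_object_pointer` and refers to a declaration with a different one always reports the definition's value, while attributes present only on the declaration are still picked up.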

…tests. NFC

phoebewang and others added 30 commits January 17, 2025 16:06

[X86][APX] Support APX + MOVRS (llvm#123264)

1274bca

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/784266

[openmp] Support CET in z_Linux_asm.S (llvm#123213)

90a05f3

When libomp is built with -cf-protection, add endbr instructions to the start of functions for Intel CET support.

[AArch64] Return early rather than asserting when Size of value passe…

c8ba551

…d to targetShrinkDemandedConstant is not 32 or 64 (llvm#123084) See llvm#123029 for details.

[LV][EVL] Disable fixed-order recurrence idiom with EVL tail folding. (…

9720be9

…llvm#122458) Currently, llvm.splice may exhibit unexpected behavior if the EVL of the second-to-last iteration is not VF*UF. Issue llvm#122461

[InstCombine] Handle mul in maintainNoSignedWrap (llvm#123299)

0e13ce7

Alive2: https://alive2.llvm.org/ce/z/Kgamks Closes llvm#123175. For `@foo1`, the nsw flag is propagated because we first convert it into `mul nsw nuw (shl nsw nuw X, 1), 3`.

[LLVM] Update AArch64 maintainers (llvm#120440)

58903c9

This merges the maintainer lists for the ARM and AArch64 backends, as many people work on both to some degree. The list includes focus areas where possible.

[llvm][OpenMP] Add implicit cast to omp.atomic.read (llvm#114659)

d7e48fb

Should the operands of `omp.atomic.read` differ, emit an implicit cast. In case of `struct` arguments, extract the 0-th index, emit an implicit cast if required, and store at the destination. Fixes llvm#112908

[X86][APX] Support APX + AMX-MOVRS/AMX-TRANSPOSE (llvm#123267)

fbb9d49

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/784266

[BOLT][AArch64]support inline-small-functions for AArch64 (llvm#120187

ee42822

) Add some functions in `AArch64MCPlusBuilder.cpp` to support inline for AArch64.

Revert "Revert "[InstCombine] Transform high latency, dependent FSQRT…

3b3590a

…/FDIV into FMUL"" (llvm#123313) Reverts llvm#123289

[AArch64] Use spill size when calculating callee saves size (NFC) (ll…

2c9dc08

…vm#123086) This is an NFC right now, as currently, all register and spill sizes are the same, but the spill size is the correct size to use here.

[AArch64] Avoid hardcoding spill size/align in FrameLowering (NFC) (l…

32a4650

…lvm#123080) This is already defined for each register class in AArch64RegisterInfo, not hardcoding it here makes these values easier to change (perhaps based on hardware mode).

[InstCombine] Fixup commit 7253c6f (llvm#123315)

e79bb87

This should fix the assert failure we were getting for the darwin OS.

Reland: [LV]: Teach LV to recursively (de)interleave. (llvm#122989)

9491f75

This commit relands the changes from "[LV]: Teach LV to recursively (de)interleave. llvm#89018" Reason for revert: - The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: llvm#122643

[LLD][COFF] Process bitcode files separately for each symbol table on…

b068f2f

… ARM64X (llvm#123194)

[MLIR] Add missing include (NFC)

101109f

Needed for libstdc++ 15 compatibility.

[X86] Rename combineScalarToVector to combineSCALAR_TO_VECTOR. NFC.

ad282f4

Match the file style of using the ISD NodeType name for the combine/lower method name.

[Flang] Use a module directory to avoid race condition (llvm#123215)

437834e

Use a module directory in a test that uses another fortran test to avoid race conditions in module creation.

[AMDGPU] Fix printing hasInitWholeWave in mir (llvm#123232)

21704a6

[bazel] Add new file added in 437834e

0d7c8c0

RKSimon and others added 15 commits January 17, 2025 11:55

[X86] Fix logical operator warnings. NFC.

a864906

[clang][bytecode] Add InitLinkScope for toplevel Expr temporary (llvm…

7075eee

…#123319)

[NFC][Offload] Structure/Readability of CMake cache (llvm#123328)

61f94eb

Preparing to add more config options and want to group them all from most-common to project / component specific.

[lldb] Skip TestStepUntilAPI on !x86_64, !aarch64

58fc802

The compiler does not support this feature on other architectures.

[libclc] Move degrees/radians to CLC library & optimize (llvm#123222)

a90b5b1

Missing half variants were also added. The builtins are now consistently emitted in vector form (i.e., with a splat of the literal to the appropriate vector size).

Revert "Revert "[Flang][Driver] Add a flag to control zero initializa… (

8c63648

llvm#123097) …tion of global v…" (llvm#123067)" This reverts commit 44ba43a. Adds the flag to bbc as well.

[DAG] Add SDPatternMatch::m_BitCast matcher (llvm#123327)

bacfdcd

Simplifies a future patch

Fix for buildbot errors on non-aarch64 targets. (llvm#123322)

ce3280a

Add missing REQUIRES: aarch64-registered-target

Revert "Revert "Revert "[Flang][Driver] Add a flag to control zero in…

8a229f5

…itializa…" (llvm#123330) Reverts llvm#123097 Reverting due to buildbot failure https://lab.llvm.org/buildbot/#/builders/89/builds/14577.

[AArch64][GlobalISel] Update and regenerate some vecreduce and other …

eff6b64

…tests. NFC

[AutoBump] Merge with eff6b64 (Jan 17)

ca75a9f

jorickert merged commit 0f6f0b4 into bump_to_0bd07652 Apr 14, 2025
4 checks passed

jorickert deleted the bump_to_eff6b642 branch April 14, 2025 07:46

31 participants