Skip to content

Conversation

@jorickert
Copy link

No description provided.

phoebewang and others added 30 commits January 17, 2025 16:06
…vm#122726)

…ecord level.

This fixes the incorrect diagnostic emitted when compiling the following
snippet

```
// string_view.h
template<class _CharT>
class basic_string_view;

typedef basic_string_view<char> string_view;

template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
    basic_string_view() 
    {
    }
};

inline basic_string_view<char> foo()
{
  return basic_string_view<char>();
}
// A.cppm
module;
#include "string_view.h"
export module A;

// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;
```

The diagnostic is 
```
string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier
```

The underlying issue is that deserialization of the `preferred_name`
attribute triggers deserialization of `basic_string_view<char>`, which
triggers the deserialization of the `preferred_name` attribute again
(since it's attached to the `basic_string_view` template).
The deserialization logic is implemented in a way that prevents it from
going on a loop in a literal sense (it detects early on that it has
already seen the `string_view` typedef when trying to start its
deserialization for the second time), but leaves the typedef
deserialization in an unfinished state. Subsequently, the `string_view`
typedef from the deserialized module cannot be merged with the same
typedef from `string_view.h`, resulting in the above diagnostic.

This PR resolves the problem by delaying the deserialization of the
`preferred_name` attribute until the deserialization of the
`basic_string_view` template is completed. As a result of deferring, the
deserialization of the `preferred_name` attribute doesn't need to go on
a loop since the type of the `string_view` typedef is already known when
it's deserialized.
When libomp is built with -cf-protection, add endbr instructions to the
start of functions for Intel CET support.
This fixes a number of issues introduced in llvm#97130 when
LLVM_LIBDIR_SUFFIX is a non-empty string. Make sure that the libdir is
always referenced as `lib${LLVM_LIBDIR_SUFFIX}`, not as just `lib` or
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}`.

This is the standard libdir convention for all LLVM subprojects. Using
`${CMAKE_INSTALL_LIBDIR}${LLVM_LIBDIR_SUFFIX}` would result in a
duplicate suffix.
…d to targetShrinkDemandedConstant is not 32 or 64 (llvm#123084)

See llvm#123029 for details.
…llvm#122458)

The currently llvm.splice may occurs unexpected behavior if the evl of
the second-to-last iteration is not VF*UF.

Issue llvm#122461
Alive2: https://alive2.llvm.org/ce/z/Kgamks
Closes llvm#123175.

For `@foo1`, the nsw flag is propagated because we first convert it into
`mul nsw nuw (shl nsw nuw X, 1), 3`.
This fixes the following tests:

    BOLT :: AArch64/check-init-not-moved.s
    BOLT :: X86/dwarf5-dwarf4-types-backward-forward-cross-reference.test
    BOLT :: X86/dwarf5-locexpr-referrence.test

When clang is compiled with `-DENABLE_LINKER_BUILD_ID=ON`.
This merges the maintainer lists for the ARM and AArch64 backends,
as many people work on both to some degree. The list includes
focus areas where possible.
When building with -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON with a recent
libstdc++ (e.g. from gcc 13.3.0) the testcase
clang/test/Misc/warning-flags-tree.c fail with the message:
```
+ diagtool tree --internal
 .../include/c++/13.3.0/bits/stl_algo.h:2013:
In function:
    _ForwardIterator std::lower_bound(_ForwardIterator, _ForwardIterator,
    const _Tp &, _Compare) [_ForwardIterator = const
    diagtool::DiagnosticRecord *, _Tp = diagtool::DiagnosticRecord, _Compare
    = bool (*)(const diagtool::DiagnosticRecord &, const
    diagtool::DiagnosticRecord &)]

Error: elements in iterator range [first, last) are not partitioned by the predicate __comp and value __val.

Objects involved in the operation:
    iterator "first" @ 0x7ffea8ef2fd8 {
    }
    iterator "last" @ 0x7ffea8ef2fd0 {
    }
```
The reason for this error is that std::lower_bound is called on
BuiltinDiagnosticsByID without it being entirely sorted. Calling
std::lower_bound If the range is not sorted, the behavior of this
function is undefined. This is detected when building with expensive
checks.

To make BuiltinDiagnosticsByID sorted we need to slightly change the
order the inc-files are included. The include of
DiagnosticCrossTUKinds.inc in DiagnosticNames.cpp is included too early
and should be moved down directly after DiagnosticCommentKinds.inc.

As a part of pull request the includes that build up
BuiltinDiagnosticsByID table are extracted into a common wrapper header
file AllDiagnosticKinds.inc that is used by both clang and diagtool.
Emit `R_LARCH_RELAX` relocations when expanding some macros, including:

- `la.tls.ie`, `la.tls.ld`, `la.tls.gd`, `la.tls.desc`,
- `call36`, `tail36`.

Other macros that need to emit `R_LARCH_RELAX` relocations was
implemented in llvm#72961, including:

- `la.local`, `la.pcrel`, `la.pcrel` expanded as `la.abs`, `la`,
`la.global`, `la/la.global` expanded as `la.pcrel`, `la.got`.

Note: `la.tls.le` macro can be relaxed when expanded with
`R_LARCH_TLS_LE_{HI20/ADD/LO12}_R` relocations. But if we do so,
previously handwritten assembly code will occur error due to the
redundant `add.{w/d}` followed by `la.tls.le`. So `la.tls.le` keeps to
expands with `R_LARCH_TLS_LE_{HI20/LO12}`.
This commit add relax relocations for `tls_le` code sequence.
Handwritten assembly and generating source code by clang are both
affected.

Scheduled `tls_le` code sequence can be relaxed normally and we can add
relax relocs when code emitting according to their relocs. Other
relaxable macros' code sequence cannot simply add relax relocs according
to their relocs, such as `PCALA_{HI20/LO12}`, we do not want to add
relax relocs when code model is large. This will be implemented in later
commit.
…ed nodebug (llvm#123253)

Fixes one of the crashes uncovered by
llvm#118710

`getOrCreateStandaloneType` asserts that a `DIType` was created for the
requested type. If the `Decl` was marked `nodebug`, however, we can't
generate debug-info for it, so we would previously trigger the assert.
For now keep the assertion around and check the `nodebug` at the
callsite.
Should the operands of `omp.atomic.read` differ, emit an implicit cast.
In case of `struct` arguments, extract the 0-th index, emit an implicit
cast if required, and store at the destination.

Fixes llvm#112908
)

Add some functions in `AArch64MCPlusBuilder.cpp` to support inline for
AArch64.
…vm#123086)

This is an NFC right now, as currently, all register and spill sizes are
the same, but the spill size is the correct size to use here.
…lvm#123080)

This is already defined for each register class in AArch64RegisterInfo,
not hardcoding it here makes these values easier to change (perhaps
based on hardware mode).
This should fix the assert failure we were getting for the darwin OS.
This commit relands the changes from "[LV]: Teach LV to recursively
(de)interleave. llvm#89018"

Reason for revert:
- The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: llvm#122643
Needed for libstdc++ 15 compatibility.
…vm#87939)

To deduce whether the optimization is legal we need to compare the target
features between caller and callee versions. The criteria for bypassing
the resolver are the following:

 * If the callee's feature set is a subset of the caller's feature set,
   then the callee is a candidate for direct call.

 * Among such candidates the one of highest priority is the best match
   and it shall be picked, unless there is a version of the callee with
   higher priority than the best match which cannot be picked from a
   higher priority caller (directly or through the resolver).

 * For every higher priority callee version than the best match, there
   is a higher priority caller version whose feature set availability
   is implied by the callee's feature set.

Example:

Callers and Callees are ordered in decreasing priority.
The arrows indicate successful call redirections.

  Caller        Callee      Explanation
=========================================================================
mops+sve2 --+--> mops       all the callee versions are subsets of the
            |               caller but mops has the highest priority
            |
     mops --+    sve2       between mops and default callees, mops wins

      sve        sve        between sve and default callees, sve wins
                            but sve2 does not have a high priority caller

  default -----> default    sve (callee) implies sve (caller),
                            sve2(callee) implies sve (caller),
                            mops(callee) implies mops(caller)
Match the file style of using the ISD NodeType name for the combine/lower method name.
Since cf2e828 (SCEV: regen some tests with UTC) had the side-effect of
moving an implied-via-addition test into IndVarSimplify, implication via
addition is no longer covered in the SCEV tests. Fix this by writing
fresh tests and checking backedge-taken output from SCEV.
Use a module directory in a test that uses another fortran test to avoid
race conditions in module creation.
…123046)

I think the only issue here was that we would erroneously consider
functions which are "in the middle" of the function were stepping to as
a part of the function, and would try to step into them (likely stepping
out of the function instead) instead of giving up early.
RKSimon and others added 15 commits January 17, 2025 11:55
Preparing to add more config options and want to group them all from
most-common to project / component specific.
The compiler does not support this feature on other architectures.
Missing half variants were also added.

The builtins are now consistently emitted in vector form (i.e., with a
splat of the literal to the appropriate vector size).
llvm#123097)

…tion of global v…" (llvm#123067)"

This reverts commit 44ba43a.

Adds the flag to bbc as well.
Add missing REQUIRES: aarch64-registered-target
…llvm#123196)

If a pointer gets freed, it may not be dereferenceable any longer, even
though there is a dominating dereferenceable assumption. As first step,
only consider assumptions if the pointer value cannot be freed if
UseDerefAtPointSemantics is used.

PR: llvm#123196
…ther TU (llvm#123059)

Close llvm#61427

And this is also helpful to implement
llvm#112294 partially.

The implementation strategy mimics
llvm#122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
…el (llvm#121678)

Doing so can cause the resulting displacement after frame layout to
become inexpressible (or cause over/underflow currently during frame
layout).

Fixes the error reported in
llvm#101840 (comment).
… recursing (llvm#123261)

`GetAttributes` returns all attributes on a given DIE, including any
attributes that the DIE references via `DW_AT_abstract_origin` and
`DW_AT_specification`. However, if an attribute exists on both the
referring DIE and the referenced DIE, the first one encountered will be
the one that takes precendence when querying the returned
`DWARFAttributes`. But there was no guarantee in which order those
attributes get visited. That means there's no convenient way of ensuring
that an attribute of a definition doesn't get shadowed by one found on
the declaration. One use-case where we don't want this to happen is for
`DW_AT_object_pointer` (which can exist on both definitions and
declarations, see llvm#123089).

This patch makes sure we visit the current DIE's attributes before
following DIE references. I tried keeping as much of the original
`GetAttributes` unchanged and just add an outer `GetAttributes` that
keeps track of the DIEs we need to visit next.

There's precendent for this iteration order in
`llvm::DWARFDie::findRecursively` and also
`lldb_private::ElaboratingDIEIterator`. We could use the latter to
implement `GetAttributes`, though it also follows `DW_AT_signature` so I
decided to leave it for follow-up.
@jorickert jorickert merged commit 0f6f0b4 into bump_to_0bd07652 Apr 14, 2025
4 checks passed
@jorickert jorickert deleted the bump_to_eff6b642 branch April 14, 2025 07:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.