Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to LLVM 18.1.0 rc 3 #167

Merged
merged 55 commits into from
Feb 21, 2024
Merged

Conversation

nikic
Copy link

@nikic nikic commented Feb 21, 2024

Lewuathe and others added 30 commits February 13, 2024 11:39
The buildbot test running on s390x platform keeps failing since [this
time](https://lab.llvm.org/buildbot/#/builders/199/builds/31136). This
is because of the dependency on the endianness of the platform. It
expects the format invalid in the big endian platform (s390x). We can
simply skip it.

See: https://discourse.llvm.org/t/mlir-s390x-linux-failure/76695
(cherry picked from commit 65ac8c1)
…4927)

This adds support for marking arbitrary general purpose registers -
except for those with special purpose (G0, I6-I7, O6-O7) - as reserved,
as needed by some software like the Linux kernel.

(cherry picked from commit c2f9885)
…bj (llvm#81463)

If llvm-readobj is built with a 32 bit time_t, it can't print such
timestamps correctly.

(cherry picked from commit 0bf4ff2)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
Having the test in the header requires including unistd.h on POSIX
platforms. This header has other declarations which may conflict with
code that uses named declarations provided by this header. For example
code using "int pipe;" would conflict with the function pipe in this
header.

Moving the code to the dylib means std::print would not be available on
Apple backdeployment targets. On POSIX platforms there is no transcoding
required so a not Standard conforming implementation is still a useful
and the observable differences are minimal. This behaviour has been done
for print before llvm#76293.

Note questions have been raised in LWG4044 "Confusing requirements for
std::print on POSIX platforms", whether or not the isatty check on POSIX
platforms is required. When this LWG issue is resolved the
backdeployment targets could become Standard compliant.

This patch is intended to be backported to the LLVM-18 branch.

Fixes: llvm#79782
(cherry picked from commit 4fb7b33)
…m#79180)" (llvm#80238)

This reverts commit bdc4110 on the
release/18.x branch.  This change was the first in a mini-series
and while I'm not aware of any particular problem from having it on
it's own in the branch, it seems safer to ship with the previous
known good state.
This enables specifing "za" or "zt0" to the clobber list for inline asm.
This complies with the acle SME addition to the asm extension here:
ARM-software/acle#276

(cherry picked from commit d9c20e4)
Adding PowerPC updates for clang and llvm into the V18.1.0 release
notes.

---------

Co-authored-by: Maryam Moghadas <maryammo@ca.ibm.com>
…la macro (llvm#80644)

When parsing the `la` macro, we add a duplicate `$` prefix in
`getOrCreateSymbol`,
leading to `error: Undefined temporary symbol $$yy` for code like:

```
xx:
	la	$2,$yy
$yy:
	nop
```

Remove the duplicate prefix.

In addition, recognize `.L`-prefixed symbols as local for O32.

See: llvm#65020.

---------

Co-authored-by: Fangrui Song <i@maskray.me>
(cherry picked from commit c007fbb)
…lvm#81834)

In fact, the cleanup attribute is only added to the CFG, but still
unhandled by CSA.
I propose dropping this false "support" statement from the docs.
… larger than UINT64_MAX. (llvm#81888)

There are no checks that the type is legal so we need to handle any
type.

(cherry picked from commit b57ba8e)
…l reg destination. (llvm#81938)

If it isn't virtual, we may extend the live range of the physical
register past were it is valid. For example, across a call.

Found while trying to enable -riscv-enable-sink-fold which enables some
copy propagation in machine sink that led to ADDIs with physical
register destinations.

(cherry picked from commit feee627)
…vm#81113)

This was added in 4b7beab. When the
flag was added implicitly elsewhere, it was added via
llvm/cmake/modules/HandleLLVMOptions.cmake, where it wasn't added on
Windows/Cygwin targets.

This avoids one warning per object file in OpenMP.

(cherry picked from commit 72f04fa)
…m#81256)

This optimization tries to optimize bitcasts from `<N x i1>` to iN, but
currently also triggers for `<N x i1>` to `<M x iK>` bitcasts, if custom
lowering has been requested for these for an unrelated reason. Fix this
by explicitly checking that the result type is scalar.

Fixes llvm#81216.

(cherry picked from commit 92d7992)
…ns (llvm#81673)

Function annotation, as part of llvm.metadata, is for the function
itself and doesn't apply to its corresponding jump table entry, so with
CFI we shouldn't replace function pointer in function annotation with
pointer to its corresponding jump table entry.

(cherry picked from commit c7a0db1)
…options (llvm#81475)

These were implemented in the COFF linker in
3923e61 and
d12b99a.

This matches the corresponding options in the ELF linker.

(cherry picked from commit d033366)
…t cases (llvm#79895)

This patch flips bit-fields in `struct flags` for big-endian in test
cases to be consistent with the definition of the structure in libomp
`kmp.h`.

(cherry picked from commit 7a9b0e4)
…mbers swapped for big-endian (llvm#79188)

The direct lock data structure has bit `0` (the least significant bit)
of the first 32-bit word set to `1` to indicate it is a direct lock. On
the other hand, the first word (in 32-bit mode) or first two words (in
64-bit mode) of an indirect lock are the address of the entry allocated
from the indirect lock table. The runtime checks bit `0` of the first
32-bit word to tell if this is a direct or an indirect lock. This works
fine for 32-bit and 64-bit little-endian because its memory layout of a
64-bit address is (`low word`, `high word`). However, this causes
problems for big-endian where the memory layout of a 64-bit address is
(`high word`, `low word`). If an address of the indirect lock table
entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it
is treated as a direct lock. This patch defines `struct
kmp_base_tas_lock` with the ordering of the two 32-bit members flipped
for big-endian PPC64 so that when checking/setting tags in member
`poll`, the second word (the low word) is used. This patch also changes
places where `poll` is not already explicitly specified for
checking/setting tags.

(cherry picked from commit ac97562)
This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
(cherry picked from commit fe3406e)
…lvm#80044)

The change is included in the 18.x release. Move the release note to the
release branch and reformat.

(cherry picked from commit b40d5b1)
Refer to commit 6611d58 ("Relax R_RISCV_ALIGN"), we can relax
R_LARCH_ALIGN by same way. Reuse `SymbolAnchor`, `RISCVRelaxAux` and
`initSymbolAnchors` to simplify codes. As `riscvFinalizeRelax` is an
arch-specific function, put it override on `TargetInfo::finalizeRelax`,
so that LoongArch can override it, too.

The flow of relax R_LARCH_ALIGN is almost consistent with RISCV. The
difference is that LoongArch only has 4-bytes NOP and all executable
insn is 4-bytes aligned. So LoongArch not need rewrite NOP sequence.
Alignment maxBytesEmit parameter is supported in psABI v2.30.

(cherry picked from commit 06a728f)
Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF, as
the implementation is in the COFF linker side, and may be relevant for
non-MinGW cases as well).
…node

before erasing.

Before trying to erase the extractelement instruction, not enough to
check for single use, need to check that it is not used in several nodes
because of the preliminary nodes reordering.

(cherry picked from commit 48bbd76)
multiregister node.

If the node can be span between several registers and same
extractelement instruction is used in several parts, it may be required
to keep such extractelement instruction to avoid compiler crash.

(cherry picked from commit 6fe21bc)
This CMakeLists.txt is used to build modules without build system
support. This was removed in d06ae33.
This is used in the documentation how to use modules.

Made some minor changes to make it work with the std.compat module using
the std module.

Note the CMakeLists.txt in the build dir should be removed once build
system support is generally available.

(cherry picked from commit fc0e9c8)
This makes it easier to run the tests in a containerized environment.

(cherry picked from commit e165bea)
…1739)

With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

                        9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic conditon in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)

(cherry picked from commit 6f90773)
davemgreen and others added 25 commits February 17, 2024 09:03
…1873)

If, like powi on windows, the libcall is unavailable we should fall back
to SDAG. Currently we try and generate a call to "".

(cherry picked from commit 47c65cf)
to satisfy the __start___llvm_orderfile reference when linking with
-bexpfull and -fprofile-generate on AIX.

(cherry picked from commit 15cccc5)
…tem stack size is too big (llvm#81996)

This patch sets the stack size of worker threads to `2 x
KMP_DEFAULT_STKSIZE` (2 x 4MB) for AIX if the system stack size is too
big. Also defines maximum stack size for 32-bit AIX.

(cherry picked from commit 2de269a)
This patch adds the missing `subnormal -> normal` part for `fpext` in
`computeKnownFPClass`.
Fixes the miscompilation reported by
llvm#80941 (comment).

(cherry picked from commit a5865c3)
llvm#81107)

Otherwise we will crash since target intrinsics don't have their types
legalized. Let the mgather get legalized first, then do the combine on
the legal type.
Fixes llvm#81088

Co-authored-by: Craig Topper <craig.topper@sifive.com>
(cherry picked from commit 06c89bd)
Extend the transform added in
llvm#76458 to also handle unsigned
division. X exact/ Y * Y == X holds independently of whether the
division is signed or unsigned.

Proofs: https://alive2.llvm.org/ce/z/wFd5Ec
(cherry picked from commit 26d4afc)
…nnamed common block definitions (llvm#81770)

This patch adds assembly file `z_AIX_asm.S` that contains the 32- and
64-bit XCOFF version of microtasking routines and unnamed common block
definitions. This code has been run through the libomp LIT tests and a
user package successfully.

(cherry picked from commit 94100bc)
In IR or C code, shift amount larger than value size is undefined
behavior. But in practice, backend lowering for shift_parts produces
add/sub of shift amounts, thus constant shift amounts might be
negative or larger than value size, which depends on ISA definition.

PowerPC ISA says, the lowest 7 bits (6 bits for 32-bit instruction)
will be taken, and if the highest among them is 1, result will be
zero, otherwise the low 6 bits (or 5 on 32-bit) are used as shift
amount.

This commit emulates the behavior and avoids array overflow in bit
permutation's value bits calculator.

(cherry picked from commit 292d9e8)
This is also necessary for enabling ClangBuiltLinux:
ClangBuiltLinux/linux#1530

(cherry picked from commit 3c02cb7)
Close llvm#80570.

In

llvm@a0b6747,
we skipped ODR checks for decls in GMF. Then it should be natural to
skip storing the ODR values in BMI.

Generally it should be fine as long as the writer and the reader keep
consistent.

However, the use of preamble in clangd shows the tricky part.

For,

```
// test.cpp
module;

// any one off these is enough to crash clangd
// #include <iostream>
// #include <string_view>
// #include <cmath>
// #include <system_error>
// #include <new>
// #include <bit>
// probably many more

// only ok with libc++, not the system provided libstdc++ 13.2.1

// these are ok

export module test;
```

clangd will store the headers as preamble to speedup the parsing and the
preamble reuses the serialization techniques. (Generally we'd call the
preamble as PCH. However it is not true strictly. I've tested the PCH
wouldn't be problematic.) However, the tricky part is that the preamble
is not modules. It literally serialiaze and deserialize things. So
before clangd parsing the above test module, clangd will serialize the
headers into the preamble. Note that there is no concept like GMF now.
So the ODR bits are stored. However, when clangd parse the file
actually, the decls from preamble are thought as in GMF literally, then
hte ODR bits are skipped. Then mismatch happens.

To solve the problem, this patch adds another bit for decls to record
whether or not the ODR bits are skipped.

(cherry picked from commit 49775b1)
(cherry picked from commit c105848)
SCEV treats "or disjoint" the same as "add nsw nuw". However, when
expanding, we cannot generally replace an add SCEV node with an "or
disjoint" instruction. Just dropping the poison flag is insufficient in
this case, we would have to actually convert the or into an add.

This is a partial fix for llvm#79861.

(cherry picked from commit 5b8e1a6)
To allow reusing it in IndVars.

(cherry picked from commit 43dd1e8)
)

IndVars may replace an instruction with one of its operands, if they
have the same SCEV expression. However, such a replacement may be more
poisonous.

First, check whether the operand being poison implies that the
instruction is also poison, in which case the replacement is always
safe. If this fails, check whether SCEV can determine that reusing the
instruction is safe, using the same check as SCEVExpander.

Fixes llvm#79861.

(cherry picked from commit 7d2b6f0)
)

As described in [test-release.sh ninja install does builds in Phase
3](llvm#80999), considerable
parts of Phase 3 of a `test-release.sh` build are run by `ninja
install`, ignoring both `$Verbose` and the parallelism set via `-j NUM`.

This patches fixes this by not specifying any explicit build target for
Phase 3, thus running the full build as usual.

Tested on `sparc64-unknown-linux-gnu`.

(cherry picked from commit f6ac598)
The default GitHub token does not have read permissions on the org, so
we need to use a custom token in order to read the members of the
llvm-release-managers team.

(cherry picked from commit 2836d8e)
We need to do this now that we are bumping the minor release number when
we create the release branch.

This also results in a slight change to the library names for LLVM. The
main library now has a more convential library name:
'libLLVM.so.$major.$minor'. The old library name: libLLVM-$major.so is
now a symlink that points to the new library. However, the symlink is
not present in the build directory. It is only present in the install
directory.

The library name was changed because it helped to keep the CMake changes
more simple.

Fixes llvm#76273

(cherry picked from commit 91a3846)
This was broken by 91a3846.

(cherry picked from commit ff4d6c6)
@nagisa nagisa merged commit edb85c3 into rust-lang:rustc/18.0-2024-02-13 Feb 21, 2024
vext01 pushed a commit to vext01/llvm-project that referenced this pull request May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet