Skip to content

[AutoBump] Merge with 5bf37484 (Feb 20) (64) #608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 30 commits into
base: bump_to_1e1bf797
Choose a base branch
from

Conversation

jorickert
Copy link

No description provided.

mordante and others added 30 commits February 20, 2025 17:47
This fixes two bugs in the ABI for over-sized bitfields for ARM and
AArch64:

The container type picked for an over-sized bitfield already contributes
to the alignment of the structure, but it should also contribute to the
"unadjusted alignment" which is used by the ARM and AArch64 PCS.

AAPCS64 defines the bitfield layout algorithm for over-sized bitfields
as picking a container which is the fundamental integer data type with
the largest size less than or equal to the bit-field width. Since
AAPCS64 has a 128-bit integer fundamental data type, we need to consider
Int128 as a container type for AArch64.
…127543)

Currently we only check if the pointers involved in runtime checks do
not wrap if we need to perform dependency checks. If that's not the
case, we generate runtime checks, even if the pointers may wrap (see
test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll).

If the pointer wraps, then we swap start and end of the runtime check,
leading to incorrect checks.

An Alive2 proof of what the runtime checks are checking conceptually (on
i4 to have it complete in reasonable time) showing the incorrect result
should be https://alive2.llvm.org/ce/z/KsHzn8

Depends on llvm#127410 to avoid
more regressions.

PR: llvm#127543
… to worklist (llvm#127999)

We already push the old shuffles to the worklist as part of the replaceValue calls, so we shouldn't need to add them to the deferred list as well - my guess is this was to ensure that the instructions got erased first to help cleanup unused instructions, but eraseInstruction should handle this now.
Resolves llvm#115394

1. Move definitions of cross-platform `getc` `ungetc` to `reader.h`.
2. Remove function pointer members to define them once per platform in
`.h`
3. Built in overlay mode in macOS m1
4. Remove `reader.cpp` as it's empty now


Also, full build doesn't yet build on macos m1 AFAIK
We have a lot of casts near this to avoid undefined behavior or
arithmetic on arbitrary signed integers, but the casts removed here
don't appear to be necessary.
…generation (llvm#127727)

Up until now the generation of vector instructions was taking place
during the top-down post-order traversal of vectorizeRec(). The issue
with this approach is that the vector instructions emitted during the
traversal can be reordered by the scheduler, making it challenging to
place them without breaking the def-before-uses rule.

With this patch we separate the vectorization decisions (done in
`vectorizeRec()`) from the code generation phase (`emitVectors()`). The
vectorization decisions are stored in the `Actions` vector and are used
by `emitVectors()` to drive code generation.
…126975)

`-Wunsafe-buffer-usage-in-libc-call` is a subgroup of
`-Wunsafe-buffer-usage` that warns about unsafe libc function calls.
Translates `cbuffer` declaration blocks to `target("dx.CBuffer")` type. Creates global variables in `hlsl_constant` address space for all `cbuffer` constant and adds metadata describing which global constant belongs to which constant buffer. For explicit constant buffer layout information an explicit layout type `target("dx.Layout")` is used. This might change in the future.

The constant globals are temporary and will be removed in upcoming pass that will translate `load` instructions in the `hlsl_constant` address space to constant buffer load intrinsics calls off a CBV handle (llvm#124630, llvm#112992).

See [Constant buffer design
doc](llvm/wg-hlsl#94) for more details.

Fixes llvm#113514, llvm#106596
This patch gets rid of the file restriction for running the new premerge
Github workflow on PRs. This will cause the jobs to be run on all the
PRs. Currently the jobs will succeed regardless of build/test failure
results. This will let us test the new infra hopefully without too much
disruption before eventually letting jobs fail when builds/tests fail
and deprecating the existing premerge system.

This is part of the launch plan as outlined in

https://discourse.llvm.org/t/googles-plan-for-the-llvm-presubmit-infrastructure/78940.
…lvm#123470)

With the goal of eventually being able to make `-Wreturn-type` default to an 
error in all language modes, this is a follow-up to llvm#123464 and updates even
more tests, mainly clang-tidy and clangd tests.
… NFC

Prefer the nonstatic member by converting unsigned to Register instead.
In llvm#121215 the reader was reorganized and the definitions of the
internal getc and ungetc functions were moved, but the includes that the
GPU builder depends on were not. This patch moves the includes to the
correct new place.
Currently the alias analysis doesn't trace the source whenever there are
operations from fir::cg dialect. This PR added support for
fir::cg::XEmboxOp, fir::cg::XReboxOp, fir::cg::XDeclareOp for a specific
application i'm working on.
llvm#123470 broke one of the clang-tidy tests; this fixes that.
Make StreamAsynchronousIO an unique_ptr instead of a shared_ptr. I tried
passing the class by value, but the llvm::raw_ostream forwarder stored
in the Stream parent class isn't movable and I don't think it's worth
changing that. Additionally, there's a few places that expect a
StreamSP, which are easily created from a StreamUP.
* Add label that identifies constant island.
* Support cases where the island is located after the function.
This is a re-apply of 083c683 with a
fix for the flang runtime build.

This works the same way as LLVM_PARALLEL_COMPILE_JOBS except that it is
specific to the flang source rather than for the whole project.

Configuring with -DFLANG_PARALLEL_COMPILE_JOBS=1 would mean that there
would only ever be one flang source being compiled at a time.

Some of the flang sources require large amounts of memory to compile, so
this option can be used to avoid OOM erros when compiling those files
while still allowing the rest of the project to compile using the
maximum number of jobs.

Update flang/CMakeLists.txt

---------

Co-authored-by: Nikita Popov <github@npopov.com>
Co-authored-by: Michael Kruse <github@meinersbur.de>
…ow (llvm#127921)

NFC: Small refactor to `calculateLegacyCbufferSize()`'s control flow to
make each branch easier to flow/more visually distinct from each other
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.