Skip to content

Conversation

@cdevadas
Copy link

No description provided.

lenary and others added 30 commits November 25, 2025 20:03
This is noted by the specification, and should save a dynamic
instruction.

Code size should be no worse than before, as the pairs of moves can
usually be turned into two 16-bit moves, but `fmv.d` is always a 32-bit
instruction.

LLVM can look through a `FSGNJ_D_IN32X`, in
`RISCVInstrInfo::isCopyInstrImpl` which helps copy propagation.
Automated with `sed -i 's/\.Value//g' lib/Target/AMDGPU/*.td` plus a
tiny bit of manual reformatting.
This upstreams the code to handle member initialization for non-record
arrays.
When individual elements of a vector are updated via vector swizzle, it needs to be handled as separate store operations to the individual vector elements.

Clang treats vectors as one unit, so if a part of a vector needs to be updated, the whole vector is loaded, some elements modified, and then the whole vector is stored.

In HLSL vector elements are handled separately. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.

Fixes llvm#152815
Flags are now passed on construction/cloning. Remove unnecessary
transferFlags call, and make code independent of VPRecipeWithIRFlags, to
support additional recipes in the future.
It is possible that a fork could occur while MutexTSDs is being held and
then cause a deadlock in a forked process when something attempts to
lock it again. Instead add it to the enable/disable list of mutexes.
Add test cases for canonicalizing AddRecs that may wrap.
…-vector-to-llvm`." (llvm#169570)

Reverts llvm#166204

There was a build issue due to a missing dependency.
libcxx requires minimal macOS 11 to build. This patch bumps the minimal
OS X target in Fuchsia's cmake cache file to 11.0 to satisfy this
requirement.
…69569)

Some globals (e.g., fir.global) have initialization regions that may
transitively reference other globals or type descriptors. Add
getInitRegion() to GlobalVariableOpInterface to retrieve these regions,
returning Region* (nullptr if the global uses attributes for
initialization, as with memref.global).
…m#166597)

Currently, -gsplit-dwarf and -mrelax are incompatible options in Clang.
The issue is that .dwo files should not contain any relocations, as they
are not processed by the linker. However, relaxable code emits
relocations in DWARF for debug ranges that reside in the .dwo file when
DWARF fission is enabled.

This patch makes DWARF fission compatible with RISC-V relaxations. It
uses the StartxEndx DWARF forms in .debug_rnglists.dwo, which allow
referencing addresses from .debug_addr instead of using absolute
addresses. This approach eliminates relocations from .dwo files.
…#682)

AMDGPU requires more complex CFI rules, normally these would be
expressed with .cfi_escape, however this would make the CFI unreadable
and makes it difficult to update registers in CFI instructions (also
something AMDGPU requires).

Authored-by: Emma Pilkington <Emma.Pilkington@amd.com>

(cherry picked from commit fd94b41)
This PR adds codegen for `cir.await` ready and suspend. One notable
difference from the classic codegen is that, in the suspend branch, it
emits an `AwaitSuspendWrapper`(`.__await_suspend_wrapper__init`)
function that is always inlined. This function wraps the suspend logic
inside an internal wrapper that gets inlined. Example here:
https://godbolt.org/z/rWYGcaaG4
…" (llvm#169546)

This reverts commit f67409c.

cc @fiigii 
Including us, several separate groups are experiencing regressions with
this change. This is the smallest reproducer pasted by @akuegel :
llvm#162930 (comment)
Allows construction of ErrorAsOutParameters from Error references.
…m#169565)

Use `getNestedDoConstruct` from Utils to get the nested DoConstructs.

Fixes llvm#169532
Implementation similar to the clang one in
`clang/lib/Headers/__clang_cuda_intrinsics.h`
This makes all of the clangd tests work with the internal shell.
Modifications needed for each test are as follows:
1. system-include-extractor.test was using variable expansion which is
   not supported in the internal shell. This patch rewrites it to use
   the readfile mechanism along with python. This isn't super pretty but
   is readily understandable and there are only two tests across the
   monorepo that use this construction, so making it prettier is hard to
   justify.
2. include-cleaner-batch-fix.test - Was using $'' construction to create
   new lines in a string. Simply replace it with multiple echo commands
   to be canonical with the rest of the repository.
3. index-tools.test - Just add IndexBenchmark to the clangd test
   depends, so the test now just works unconditionally. This should
   significantly increase test coverage at little cost.

Reviewers: ilovepi, HighCommander4, petrhosek, kadircet

Reviewed By: ilovepi

Pull Request: llvm#169539
Enable it now that all of the tests pass under the internal shell. The
internal shell is slightly faster (10-15%) and also provides a better
debugging experience.

Reviewers: petrhosek, ilovepi, kadircet, HighCommander4

Reviewed By: ilovepi

Pull Request: llvm#169540
Fixes smoke-dev mpi-reduce, mpi-allreduce.

Fixes errors of the form EmissaryMPI.h:54:3: error: unknown type name
'MPI_Datatype'.
This reverts commit 9c414c4.

This one is causing buildbot failures too at CMake configure time:
1. https://lab.llvm.org/buildbot/#/builders/193/builds/12452
Split from llvm#158900 it adds a PerThreadContainer that can use STL-like
indexed containers based on a slightly refactored PerThreadTable.

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
…uiltins (llvm#168666)

This PR extends __scoped_atomic builtins with inc and dec functions.
They map to LLVM IR `atomicrmw uinc_wrap` and `atomicrmw udec_wrap`.
These enable implementation of OpenCL-style atomic_inc / atomic_dec with
wrap semantics on targets supporting scoped atomics (e.g. GPUs).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Icohedron and others added 18 commits November 27, 2025 12:02
…-uses.ll` due to changes to ReplaceConstant (llvm#169848)

Fixes an LLVM DirectX codegen test after it broke due to llvm#169141

The CBuffer loads and GEPs are no longer duplicated when there are two
or more accesses within the same basic block.
This PR removes the duplicate check for CBuffer load and GEP from the
original test function `@f` and adds a new test function `@g` which
places duplicate CBuffer loads into separate basic blocks.
Indirect call instrumentation snippet uses x16 register in exit
handler to go to destination target

    __bolt_instr_ind_call_handler_func:
            msr  nzcv, x1
            ldp  x0, x1, [sp], llvm#16
            ldr  x16, [sp], llvm#16
            ldp  x0, x1, [sp], llvm#16
            br   x16	<-----

This patch adds the instrumentation snippet by calling instrumentation
runtime library through indirect call instruction and adding the wrapper
to store/load target value and the register for original indirect instruction.

Example:
            mov x16, foo

    infirectCall:
            adrp x8, Label
            add  x8, x8, #:lo12:Label
            blr x8

Before:

    Instrumented indirect call:
            stp     x0, x1, [sp, #-16]!
            mov     x0, x8
            movk    x1, #0x0, lsl llvm#48
            movk    x1, #0x0, lsl llvm#32
            movk    x1, #0x0, lsl llvm#16
            movk    x1, #0x0
            stp     x0, x1, [sp, #-16]!
            adrp    x0, __bolt_instr_ind_call_handler_func
            add     x0, x0, #:lo12:__bolt_instr_ind_call_handler_func
            blr     x0

    __bolt_instr_ind_call_handler:  (exit snippet)
            msr     nzcv, x1
            ldp     x0, x1, [sp], llvm#16
            ldr     x16, [sp], llvm#16
            ldp     x0, x1, [sp], llvm#16
            br      x16    <- overwrites the original value in X16

    __bolt_instr_ind_call_handler_func:  (entry snippet)
            stp     x0, x1, [sp, #-16]!
            mrs     x1, nzcv
            adrp    x0, __bolt_instr_ind_call_handler
            add     x0, x0, x0, #:lo12:__bolt_instr_ind_call_handler
            ldr     x0, [x0]
            cmp     x0, #0x0
            b.eq    __bolt_instr_ind_call_handler
            str     x30, [sp, #-16]!
            blr     x0     <--- runtime lib store/load all regs
            ldr     x30, [sp], llvm#16
            b       __bolt_instr_ind_call_handler

_________________________________________________________________________

After:

            mov     x16, foo
    infirectCall:
            adrp    x8, Label
            add     x8, x8, #:lo12:Label
            blr     x8

    Instrumented indirect call:
            stp     x0, x1, [sp, #-16]!
            mov     x0, x8
            movk    x1, #0x0, lsl llvm#48
            movk    x1, #0x0, lsl llvm#32
            movk    x1, #0x0, lsl llvm#16
            movk    x1, #0x0
            stp     x0, x30, [sp, #-16]!
            adrp    x8, __bolt_instr_ind_call_handler_func
            add     x8, x8, #:lo12:__bolt_instr_ind_call_handler_func
            blr     x8       <--- call trampoline instr lib
            ldp     x0, x30, [sp], llvm#16
            mov     x8, x0   <---- restore original target
            ldp     x0, x1, [sp], llvm#16
            blr     x8       <--- original indirect call instruction

    // don't touch regs besides x0, x1
    __bolt_instr_ind_call_handler:  (exit snippet)
            ret     <---- return to original function with indirect call

    __bolt_instr_ind_call_handler_func: (entry snippet)
            adrp    x0, __bolt_instr_ind_call_handler
            add     x0, x0, #:lo12:__bolt_instr_ind_call_handler
            ldr     x0, [x0]
            cmp     x0, #0x0
            b.eq    __bolt_instr_ind_call_handler
            str     x30, [sp, #-16]!
            blr     x0     <--- runtime lib store/load all regs
            ldr     x30, [sp], llvm#16
            b       __bolt_instr_ind_call_handler
Upstream TryCall Op as a prerequisite for Try Catch work

Issue llvm#154992
Add a variant of m_Intrinsic that matches a variable runtime ID.
…lvm#169338)

In some case, VPWidenPointerInductions become only used by scalars after
legalizeAndOptimizationInducftions was already run, for example due to
some VPlan optimizations.

Move the code to scalarize VPWidenPointerInductions to a helper and use
it if needed.

This fixes a crash after llvm#148274 in the added test case.

Fixes llvm#169780
…ses.ll` more strict (llvm#169855)

Continuation of PR llvm#169848 to address PR comments.

This PR makes the test more strict by adding CHECKs to ensure the loads
are indeed using the same or different GEPs.
This pull request addresses an issue encountered when building
**libcxx** with certain configurations (`-D_LIBCPP_HAS_MUSL_LIBC` &
`-D__linux__`) that lack the `_GNU_SOURCE` definition. Specifically,
this issue arises if the system **musl libc** is built with
`_BSD_SOURCE` instead of `_GNU_SOURCE`. The resultant configuration
leads to problems with the "Strtonum functions" in the file
[libcxx/include/__locale_dir/support/linux.h](https://github.com/llvm/llvm-project/tree/master/libcxx/include/__locale_dir/support/linux.h),
affecting the following functions:

- `__strtof`
- `__strtod`
- `__strtold`

**Error messages displayed include**:
```console
error: no member named 'strtof_l' in the global namespace
```
```console
error: no member named 'strtod_l' in the global namespace
```
```console
error: no member named 'strtold_l' in the global namespace
```

For more insight, relevant code can be accessed
[here](https://github.com/llvm/llvm-project/blob/79cd1b7a25cdbf42c7234999ae9bc51db30af1f0/libcxx/include/__locale_dir/support/linux.h#L85-L95).
…69458)

Summary
======
This PR update the schedule for online sync-up and update link for past
meeting slides.

Changes
======
* Remove the wednesday schedule, since we did not have the meeting for
Americas-friendly timezones.
* Use a single folder for past meeting slides instead of individual
links.

Related Links
=========
* [Meeting materials for Qualification Working
Group](https://llvm.org/docs/QualGroup.html#meeting-materials)
* [Online
Sync-Ups](https://llvm.org/docs/GettingInvolved.html#online-sync-ups)

---------

Signed-off-by: ZakyHermawan <zaky.hermawan9615@gmail.com>
…vm#166684)

The `arith.cmpf` lowering pattern used to generate invalid IR when an
unsupported floating-point type was used.
…ations. (llvm#169273)

This is achieved by using some of the bits of RelType to tag vendor namespaces. This change also adds a relocation iterator for RISCV that folds vendor namespaces into the RelType of the following relocation.

This patch is extracted from the implementation of RISCV vendor-specific relocations in the CHERIoT LLVM downstream: CHERIoT-Platform/llvm-project@3d6d6f7
@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@z1-cciauto
Copy link
Collaborator

@cdevadas cdevadas merged commit 91f1bb6 into amd-feature/wave-transform Nov 28, 2025
45 checks passed
@cdevadas cdevadas deleted the amd/dev/cdevadas/wave-transform/merge-from-stg-nov-28 branch November 28, 2025 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.