@ronlieb
Collaborator

@ronlieb ronlieb commented Nov 11, 2025

No description provided.

naveen-seth and others added 30 commits November 10, 2025 21:24
…w clangOptions library" (llvm#167374)

This relands llvm#167348.

The original PR was reverted due to a reported build failure, which was
later diagnosed as a local issue in the developer’s checkout or build
state. See discussion here:
llvm#163659 (comment)

No additional changes have been made in this reland.
This isn't reachable today but will come into play once we reorder
passes for llvm#147352 and llvm#147351.

Note that the `CBufferRowIntrin` helper struct is copied from the
`DXILCBufferAccess` pass, but it will be removed from there when we
simplify that pass in llvm#147351
trans bf16 insts do not support clamp and omod
…m#167276)

Preparing to move most of the implementation out of the header file.

* llvm#167280

---------

Co-authored-by: Naveen Seth Hanig <naveen.hanig@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…lvm#167383)

Re-land: Add test for Complex imag literal GNU extension after updating
the name
…#152830)

This adds the codegen support for the dyn_groupprivate clause.
It is undefined behavior to call a function with a mismatched calling
convention. Rather than crashing on such a call, the compiler should
compile it.

This LLVM defect was identified via the AMD Fuzzing project.
…166658) (llvm#166846)

The previous implementation in llvm#166252 (rolled back in llvm#166658) caused
buildbot failures due to a bug in an added test. The modified
`UseAfterClose` did not pass a `struct flock` value to `fcntl`:


https://github.com/llvm/llvm-project/blob/2d5170594147b42a37698760d6e0194eec4f1390/libc/test/src/fcntl/fcntl_test.cpp#L175

ASAN caught this and reported an error in the `fcntl` implementation
when the unspecified argument was accessed:


https://github.com/llvm/llvm-project/blob/c12cb2892c808af459eaa270b8738a2b375ecc9b/libc/src/__support/OSUtil/linux/fcntl.cpp#L59
Use getDefiningRecipe to future-proof the code. Split off from
llvm#156262 as suggested.
This commit introduces `SpecialCaseList::Match`, a small struct to hold
the matched rule and its line number. This simplifies the `match`
methods by allowing them to return a single value instead of using a
callback.
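
A minimal sketch of the idea, not the actual LLVM code: `Match`, `RuleSet`, and the exact-string comparison below are hypothetical stand-ins (the real special-case list matches patterns, not plain strings). It only illustrates how returning a small struct with the rule and its line number lets a `match` method hand back a single value instead of invoking a callback.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the struct described above: the matched rule
// plus the line it was defined on. LineNo == 0 means "no match".
struct Match {
  std::string Rule;
  unsigned LineNo = 0;
  bool matched() const { return LineNo != 0; }
};

// Simplified rule container that compares rules by exact string equality.
class RuleSet {
public:
  void add(std::string Rule, unsigned LineNo) {
    assert(LineNo != 0 && "line numbers are 1-based; 0 means no match");
    Rules.emplace_back(std::move(Rule), LineNo);
  }

  // Returns the first matching rule as a single value; no callback needed.
  Match match(const std::string &Query) const {
    Match Result;
    for (const auto &Entry : Rules) {
      if (Entry.first == Query) {
        Result.Rule = Entry.first;
        Result.LineNo = Entry.second;
        break;
      }
    }
    return Result;
  }

private:
  std::vector<std::pair<std::string, unsigned>> Rules;
};
```

A caller can then write `if (Match M = Rules.match(Name); M.matched())` (C++17) or simply inspect the returned value, rather than passing in a lambda to receive the rule and line number.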

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Make the systemSupportsMemoryTagging() function return even on systems
that don't support memory tagging. This avoids the need to always check
if memory tagging is supported before calling the function.

Make systemSupportsMemoryTagging() cache the getauxval return value
instead of calling the function every time.

Updated the code that calls systemSupportsMemoryTagging().
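
The caching pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `queryHwcaps` is a hypothetical stand-in for the `getauxval` call, and `HypotheticalTaggingBit` is an invented bit, not a real HWCAP value. The function-local static is initialized on the first call only, so the query runs once.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the getauxval query; counts how often it runs.
// It pretends the system reports no memory-tagging capability.
static int QueryCount = 0;
static std::uint64_t queryHwcaps() {
  ++QueryCount;
  return 0;
}

// Invented capability bit for illustration only (not a real HWCAP value).
constexpr std::uint64_t HypotheticalTaggingBit = 1u << 0;

// Safe to call on any system: it simply returns false when tagging is not
// supported. The static local caches the query result after the first call.
static bool systemSupportsMemoryTagging() {
  static const std::uint64_t Hwcaps = queryHwcaps();
  assert(QueryCount == 1 && "capability query should run exactly once");
  return (Hwcaps & HypotheticalTaggingBit) != 0;
}
```

Callers no longer need a separate "is this platform capable" guard; they call the function directly and branch on the result.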
This hardens the code to check based on WideMember0's operands. This
ensures each call will go through the same check. Should be NFC
currently but needed when generalizing in follow-up patches.
)

Refactor computeWidth from CUFOpConversion into a shared helper function
computeElementByteSize in CUFCommon.
…m#153608)

This should be the peephole's job. ANDS sets the V flag to 0, which is
why signed comparisons with 0 can safely be replaced with TST. Note that
this applies only to AArch64, because ANDS on ARM leaves the V flag
unchanged.

Fixes: llvm#154387
… of pattern matching (llvm#166806)

Flang alias analysis used to find allocation sites by pattern matching
allocation ops, mainly in the FIR dialect. This MR instead characterizes
an allocation site by whether the result of an op has a MemAlloc effect.
After llvm#164165, we emit warnings
from non-system headers by default.
This change only preserves functionality of `clang-tidy` as it was
before the change.
…m#158224)

Both conceptually belong to the same subtarget, so it should not
be necessary to pass in the context TargetRegisterInfo to any
TargetInstrInfo member. Add this reference so those superfluous
arguments can be removed.

Most targets placed their TargetRegisterInfo as a member
in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo,
so unify all targets to look the same.
The save stats test would fail downstream because the tests were run
from a read-only directory. The cwd flag would attempt to place the
statistics in that read-only directory, causing an error. This patch
ensures that the tests are always run from a temp directory.
llvm#167350)

fwrite and friends don't modify errno if no error occurred. Therefore
fwrite_unlocked's return value shouldn't be constructed from errno
without checking if an error actually occurred.
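
As a hedged illustration of the rule above (not the LLVM libc code): `WriteResult` and `checkedWrite` are hypothetical names, and the sketch assumes a POSIX-style `fwrite` that sets `errno` on failure. The point is that `errno` is read only after a short write, because on success it may still hold a stale value from an unrelated earlier call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>

// Hypothetical result type: items written plus an error code (0 = no error).
struct WriteResult {
  std::size_t Written;
  int Error;
};

// Consult errno only when fwrite reports fewer items than requested.
// A zero-sized request legitimately returns 0 and is not an error.
static WriteResult checkedWrite(const void *Data, std::size_t Size,
                                std::size_t Count, std::FILE *Stream) {
  assert(Stream != nullptr && "stream must be open");
  std::size_t Written = std::fwrite(Data, Size, Count, Stream);
  if (Size != 0 && Written < Count)
    return {Written, errno};
  return {Written, 0};
}
```

With this shape, a stale `errno` left over from earlier code cannot turn a successful write into a reported failure.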

This fixes an error introduced by
llvm@9e2f73f
This is merging loads and stores so use the combined DebugLoc.

Not sure if computeBase should be using the merged location from
all the involved instructions. I'm also not sure how to test this
sort of thing.
Michael137 and others added 22 commits November 11, 2025 09:29
… names (llvm#167293)

We shouldn't be calling `getUnversionedName` unconditionally because
DWARFv6-enabled Clang will emit versioned language names (i.e.,
`sourceLanguageName`; see the new test case). We need to check
`hasVersionedName` and pick the appropriate API based on that.
Otherwise we assert in `getUnversionedName`.
When transforming floating-point induction variables into integer ones,
make sure we stay within the bounds of fp values that can be represented
as integers without gaps, i.e., 2^24 and 2^53 for IEEE-754 single and
double precision respectively (both on negative and positive side).

Fixes: llvm#166496.
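
The bounds quoted above follow from the significand width of each type. The sketch below is illustrative only: `maxContiguousInteger` and `fitsWithoutGaps` are hypothetical helpers, not functions from the patch, and the code assumes IEEE-754 `float` and `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest N such that every integer in [-N, N] is exactly representable in
// floating-point type T: 2^digits, where digits is the significand precision
// (24 for IEEE-754 single precision, 53 for double precision).
template <typename T>
constexpr std::int64_t maxContiguousInteger() {
  return std::int64_t{1} << std::numeric_limits<T>::digits;
}

// Hypothetical check: true if an induction variable ranging over [Lo, Hi]
// in type T only ever takes integer values that T can represent without gaps.
template <typename T>
bool fitsWithoutGaps(std::int64_t Lo, std::int64_t Hi) {
  assert(Lo <= Hi && "empty range");
  return Lo >= -maxContiguousInteger<T>() && Hi <= maxContiguousInteger<T>();
}
```

Just past the bound the gaps appear: 2^24 + 1 has no exact `float` representation, so a `float` counter stepping by 1 stops advancing there while an integer counter would keep going.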
…6395)

So we know before _what_ entry in the chain we need to look for the
InitList.

Fixes llvm#166171
This means that VPExpressions will now be constructed for
VPPartialReductionRecipe's when the loop has tail-folding predication.

Note that control-flow (if/else) predication is not yet handled
for partial reductions, because of the way partial reductions
are recognised and built up.
…nux) (llvm#165451)

This patch reimplements the SME ABI `__arm_za_disable` routine within
libunwind. This routine must be called before resuming from unwinding on
AArch64 platforms with SME support.

Before calling the routine, we need to check that SME is available. In
this patch, this is implemented for Linux-based platforms by checking
HWCAP2. It should be possible to implement this check for other
platforms as required.

This patch includes a test for this functionality. This test requires
SME, so on platforms without it, it will simply pass.
llvm#161635)

`aligned_storage` has been deprecated and will most likely be removed in
a future version of C++. This patch removes some of its uses to avoid
having to work around its removal in the future.
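
One common replacement, shown here as a sketch rather than as what the patch necessarily does, is an `alignas` byte array used with placement new. `ManualSlot` is a hypothetical class invented for this example.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical single-object slot that provides raw, suitably aligned
// storage for a T without using std::aligned_storage.
template <typename T>
class ManualSlot {
public:
  ManualSlot() = default;
  ManualSlot(const ManualSlot &) = delete;
  ManualSlot &operator=(const ManualSlot &) = delete;

  // Constructs a T in place and remembers the pointer placement new returns.
  template <typename... Args>
  T &emplace(Args &&...As) {
    assert(Ptr == nullptr && "slot already holds an object");
    Ptr = ::new (static_cast<void *>(Storage)) T(std::forward<Args>(As)...);
    return *Ptr;
  }

  // Ends the lifetime of the stored object; the storage can be reused.
  void destroy() {
    assert(Ptr != nullptr && "no live object to destroy");
    Ptr->~T();
    Ptr = nullptr;
  }

  bool hasValue() const { return Ptr != nullptr; }

private:
  alignas(T) unsigned char Storage[sizeof(T)];
  T *Ptr = nullptr;
};
```

Keeping the pointer returned by placement new avoids having to `reinterpret_cast` the buffer back to `T *` later, at the cost of one extra pointer per slot.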
llvm#167255)

Patch updates transform.memref.erase_dead_alloc_and_stores to not delete
escaped allocations.
The original problem does not reproduce anymore, but add the test case.

Fixes llvm#163563
This allows us to strip a special case in VPWidenGEP::execute.
This patch adds cost model tests for `bfloat` operations with
`+sve-b16b16`. Currently, some of these costs are higher than they
should be as the cost model is assuming `bfloat`s need promotion, but
some of these operations are natively supported with `+sve-b16b16`.
After llvm#167359 / 95db31e.

A fix was attempted in llvm#167423 but was not quite enough.

From what I could understand, in the v1 format you have to specify
all the basic blocks, whereas before, !call_me implied they were all
cold (I think; very shaky understanding here).

For this test we want to see blocks like call_me/foo/call_me.
So adding a line for block 1 fixes the tests.

It could produce more blocks at some point but I think as long
as foo is within two of them, it'll be fine.
… into new clangOptions library" (llvm#167374)"

This reverts commit f63d33d.
)

Currently the only way to enable the use of wide active lane masks is to pass
-enable-wide-lane-mask and force both interleaving & tail-folding with additional
flags. This patch changes selectInterleaveCount to consider interleaving if wide
lane masks were requested, although the feature remains off by default.
)

This patch proposes replacing the inline assembly with an intrinsic and
adds the `noreturn` to avoid warnings when the caller is also
`noreturn`.
Adds `transform.xegpu.set_gpu_launch_threads` that overrides `gpu.launch` operation threads.
@z1-cciauto
Collaborator

@z1-cciauto z1-cciauto merged commit c2e90ff into amd-staging Nov 11, 2025
18 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20251111061038 branch November 11, 2025 16:26