@ronlieb ronlieb commented Nov 10, 2025

xfails: llvm/test/DebugInfo/AMDGPU/heterogeneous-dwarf-diop-diexpression-address-spaces.ll

lukel97 and others added 28 commits November 10, 2025 12:10
… blocks (llvm#160449)

Split off from llvm#158690. Currently, if an instruction needs to be
predicated due to tail folding, it will also have a predicated discount
applied to it in multiple places.
This is likely inaccurate because we can expect a tail folded
instruction to be executed on every iteration bar the last.

This fixes it by checking if the instruction/block was originally
predicated, and in doing so prevents vectorization with tail folding
where we would have had to scalarize the memory op anyway.

On llvm-test-suite this causes 4 loops in total to no longer be
vectorized with -O3 on arm64-apple-darwin, and there's no observable
performance impact.
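
The costing rule described above can be sketched as follows. This is a toy model rather than the vectorizer's code: `InstInfo`, `scalarizedCost`, and the divisor of 2 are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical description of one instruction being costed.
struct InstInfo {
  std::uint64_t ScalarCost;     // cost of one scalar copy of the instruction
  bool OriginallyPredicated;    // conditional in the original loop body
  bool PredicatedByTailFolding; // needs a mask only because the tail is folded
};

// Assumed divisor: an originally predicated block is taken to run on
// roughly one iteration in two.
constexpr std::uint64_t PredBlockDivisor = 2;

// Cost of scalarizing the instruction for a vectorization factor VF.
constexpr std::uint64_t scalarizedCost(const InstInfo &I, std::uint64_t VF) {
  std::uint64_t Cost = I.ScalarCost * VF;
  // The discount applies only when the block was already conditional.
  // Predication introduced by tail folding alone gets no discount,
  // because the instruction still runs on nearly every iteration.
  if (I.OriginallyPredicated)
    Cost /= PredBlockDivisor;
  return Cost;
}

static_assert(scalarizedCost(InstInfo{4, false, true}, 4) == 16);
static_assert(scalarizedCost(InstInfo{4, true, false}, 4) == 8);
```

With the discount gone, a tail-folded memory op that has to be scalarized keeps its full cost, which is what can tip the decision away from vectorizing.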
…lvm#166855)

These checks ensure that retained nodes of a DISubprogram belong to the
subprogram.

Tests with incorrect IR are fixed: variables of one subprogram should not appear in the retained nodes of another subprogram.

Also, the interface for accessing DISubprogram's retained nodes is slightly
refactored. `DISubprogram::visitRetainedNodes` and
`DISubprogram::forEachRetainedNode` are added to avoid repeating checks
like
```
if (const auto *LV = dyn_cast<DILocalVariable>(N))
  ...
else if (const auto *L = dyn_cast<DILabel>(N))
  ...
else if (const auto *IE = dyn_cast<DIImportedEntity>(N))
  ...
```
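
As a rough illustration of why a visiting helper removes that repetition, here is a self-contained sketch built on `std::variant`. The node structs and this `forEachRetainedNode` are stand-ins invented for the example; they are not LLVM's classes and do not reflect the signatures added by the patch.

```cpp
#include <string>
#include <variant>
#include <vector>

// Stand-ins for the three node kinds named above (not LLVM's classes).
struct LocalVariable { std::string Name; };
struct Label { std::string Name; };
struct ImportedEntity { std::string Name; };
using RetainedNode = std::variant<LocalVariable, Label, ImportedEntity>;

// Helper that builds one overload set out of several lambdas.
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

// Hypothetical helper: the kind dispatch is written once here, and callers
// supply one callback per node kind.
template <class OnVar, class OnLabel, class OnImport>
void forEachRetainedNode(const std::vector<RetainedNode> &Nodes, OnVar V,
                         OnLabel L, OnImport I) {
  for (const RetainedNode &N : Nodes)
    std::visit(Overloaded{V, L, I}, N);
}

// Example caller: count local variables without a cast chain.
inline int countLocalVariables(const std::vector<RetainedNode> &Nodes) {
  int Count = 0;
  forEachRetainedNode(
      Nodes, [&](const LocalVariable &) { ++Count; },
      [](const Label &) {}, [](const ImportedEntity &) {});
  return Count;
}
```

The sketch needs C++17 for the aggregate-with-bases deduction guide.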
This paper allows use of * in a multidimensional array extent within a
_Generic selection association, as a wildcard for any array extent.

Clang does not currently support this feature, so this is just some
initial test coverage along with an update to the conformance site.
…lvm#166202)

`DISubprogram`s are attached to call sites to support various debug info
features, including entry values and tail calls. Clang 9.0
(0f65168) was the first version to
include this kind of call site `DISubprogram` attachment.

This earlier work appears to visit only some call site variants,
however. The call site attachment was added to a higher-level `EmitCall`
path in Clang's code gen that is only used by some call variants. In
particular, some C++ member calls use a different code gen path, which
did not include this call site attachment step, and thus the debug info
it triggers (e.g. call site entries) was not emitted for such calls.

This moves `DISubprogram` attachment to a lower-level call emission path
that is used by all call variants.

Fixes llvm#161962
… mode (llvm#166576)

Fixes a bug causing every conversion to fail fatally with "expected
pattern to replace the root operation or modify it in place" when
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` is enabled and pattern
rollback is disabled.

When `allowPatternRollback` is disabled, the rewriter does not keep
track of the rewrites it performs and can therefore not use that list to
check whether the root op was replaced or updated in place.
Add an error test to check that a suitable error diagnostic is generated
for the use of the GL::unpackhalf2x16 operation in invalid contexts.

Fixes llvm#166965

Co-authored-by: Tim Corringham <tcorring@amd.com>
…eger comparisons (llvm#166778)

A generic alternative to llvm#166564: assume that expanding integer
comparisons will be expensive if they are larger than the largest legal
type, and so avoid sinking them if they are also used in the current BB + any phis.

Fixes llvm#166534
…m#167226)

It's supported together with the other spellings and results in the same attribute.
Document it and prefer it in the documentation as the `asm()` spelling is C++ and GNU-only.

See: llvm#167221 (comment)
Add `transform.xegpu.get_desc_op` transform op that finds a
`xegpu.create_nd_tdesc` producer op of a `Value`.
Added CONSTEXPR macro and test for the following intrinsics:

-- _mm_mask_adds_epi16 _mm_maskz_adds_epi16
-- _mm_mask_adds_epi8 _mm_maskz_adds_epi8
-- _mm_mask_adds_epu16 _mm_maskz_adds_epu16
-- _mm_mask_adds_epu8 _mm_maskz_adds_epu8
-- _mm_mask_broadcastb_epi8 _mm_maskz_broadcastb_epi8
-- _mm_mask_broadcastw_epi16 _mm_maskz_broadcastw_epi16
-- _mm_mask_cvtepi8_epi16 _mm_maskz_cvtepi8_epi16
-- _mm_mask_cvtepu8_epi16 _mm_maskz_cvtepu8_epi16
-- _mm_mask_packs_epi16 _mm_maskz_packs_epi16
-- _mm_mask_packs_epi32 _mm_maskz_packs_epi32
-- _mm_mask_packus_epi16 _mm_maskz_packus_epi16
-- _mm_mask_packus_epi32 _mm_maskz_packus_epi32
-- _mm_mask_set1_epi16 _mm_maskz_set1_epi16
-- _mm_mask_set1_epi8 _mm_maskz_set1_epi8
-- _mm_mask_slli_epi16 _mm_mask_slli_epi16
-- _mm_mask_subs_epi16 _mm_maskz_subs_epi16
-- _mm_mask_subs_epi8 _mm_maskz_subs_epi8
-- _mm_mask_subs_epu16 _mm_maskz_subs_epu16
-- _mm_mask_subs_epu8 _mm_maskz_subs_epu8
-- _mm_mask_unpackhi_epi16 _mm_maskz_unpackhi_epi16
-- _mm_mask_unpackhi_epi8 _mm_maskz_unpackhi_epi8
-- _mm_mask_unpacklo_epi16 _mm_maskz_unpacklo_epi16
-- _mm_mask_unpacklo_epi8 _mm_maskz_unpacklo_epi8

-- _mm256_mask_adds_epi16 _mm256_maskz_adds_epi16
-- _mm256_mask_adds_epi8 _mm256_maskz_adds_epi8
-- _mm256_mask_adds_epu16 _mm256_maskz_adds_epu16
-- _mm256_mask_adds_epu8 _mm256_maskz_adds_epu8
-- _mm256_mask_broadcastb_epi8 _mm256_maskz_broadcastb_epi8
-- _mm256_mask_broadcastw_epi16 _mm256_maskz_broadcastw_epi16
-- _mm256_mask_cvtepi8_epi16 _mm256_maskz_cvtepi8_epi16
-- _mm256_mask_cvtepu8_epi16 _mm256_maskz_cvtepu8_epi16
-- _mm256_mask_packs_epi16 _mm256_maskz_packs_epi16
-- _mm256_mask_packs_epi32 _mm256_maskz_packs_epi32
-- _mm256_mask_packus_epi16 _mm256_maskz_packus_epi16
-- _mm256_mask_packus_epi32 _mm256_maskz_packus_epi32
-- _mm256_mask_set1_epi16 _mm256_maskz_set1_epi16
-- _mm256_mask_set1_epi8 _mm256_maskz_set1_epi8
-- _mm256_mask_slli_epi16 _mm256_mask_slli_epi16
-- _mm256_mask_subs_epi16 _mm256_maskz_subs_epi16
-- _mm256_mask_subs_epi8 _mm256_maskz_subs_epi8
-- _mm256_mask_subs_epu16 _mm256_maskz_subs_epu16
-- _mm256_mask_subs_epu8 _mm256_maskz_subs_epu8
-- _mm256_mask_unpackhi_epi16 _mm256_maskz_unpackhi_epi16
-- _mm256_mask_unpackhi_epi8 _mm256_maskz_unpackhi_epi8
-- _mm256_mask_unpacklo_epi16 _mm256_maskz_unpacklo_epi16
-- _mm256_mask_unpacklo_epi8 _mm256_maskz_unpacklo_epi8

-- _mm512_mask_adds_epi16 _mm512_maskz_adds_epi16
-- _mm512_mask_adds_epi8 _mm512_maskz_adds_epi8
-- _mm512_mask_adds_epu16 _mm512_maskz_adds_epu16
-- _mm512_mask_adds_epu8 _mm512_maskz_adds_epu8
-- _mm512_mask_broadcastb_epi8 _mm512_maskz_broadcastb_epi8
-- _mm512_mask_broadcastw_epi16 _mm512_maskz_broadcastw_epi16
-- _mm512_mask_mov_epi16 _mm512_maskz_mov_epi16
-- _mm512_mask_mov_epi8 _mm512_maskz_mov_epi8
-- _mm512_mask_packs_epi16 _mm512_maskz_packs_epi16
-- _mm512_mask_packs_epi32 _mm512_maskz_packs_epi32
-- _mm512_mask_packus_epi16 _mm512_maskz_packus_epi16
-- _mm512_mask_packus_epi32 _mm512_maskz_packus_epi32
-- _mm512_mask_set1_epi16 _mm512_maskz_set1_epi16
-- _mm512_mask_set1_epi8 _mm512_maskz_set1_epi8
-- _mm512_mask_subs_epi16 _mm512_maskz_subs_epi16
-- _mm512_mask_subs_epi8 _mm512_maskz_subs_epi8
-- _mm512_mask_subs_epu16 _mm512_maskz_subs_epu16
-- _mm512_mask_subs_epu8 _mm512_maskz_subs_epu8
-- _mm512_mask_unpackhi_epi16 _mm512_maskz_unpackhi_epi16
-- _mm512_mask_unpackhi_epi8 _mm512_maskz_unpackhi_epi8
-- _mm512_mask_unpacklo_epi16 _mm512_maskz_unpacklo_epi16
-- _mm512_mask_unpacklo_epi8 _mm512_maskz_unpacklo_epi8

closes llvm#162070
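
For readers unfamiliar with the mask/maskz naming, the following scalar reference model shows the lane semantics of the 128-bit `adds_epi16` pair and why such operations can be checked at compile time. It is a sketch only: `V8i16`, `maskAddsEpi16`, and `maskzAddsEpi16` are hypothetical names, and the code does not use the real intrinsics or headers.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>

using V8i16 = std::array<std::int16_t, 8>; // eight signed 16-bit lanes

// Signed saturating add of two 16-bit values.
constexpr std::int16_t saturatingAdd(std::int16_t A, std::int16_t B) {
  int Sum = int(A) + int(B);
  if (Sum > std::numeric_limits<std::int16_t>::max())
    return std::numeric_limits<std::int16_t>::max();
  if (Sum < std::numeric_limits<std::int16_t>::min())
    return std::numeric_limits<std::int16_t>::min();
  return static_cast<std::int16_t>(Sum);
}

// "mask" form: lanes whose mask bit is clear are copied from Src.
constexpr V8i16 maskAddsEpi16(V8i16 Src, std::uint8_t K, V8i16 A, V8i16 B) {
  V8i16 Dst{};
  for (std::size_t I = 0; I < 8; ++I)
    Dst[I] = ((K >> I) & 1) ? saturatingAdd(A[I], B[I]) : Src[I];
  return Dst;
}

// "maskz" form: lanes whose mask bit is clear are zeroed.
constexpr V8i16 maskzAddsEpi16(std::uint8_t K, V8i16 A, V8i16 B) {
  return maskAddsEpi16(V8i16{}, K, A, B);
}

constexpr V8i16 DemoA{32767, -32768, 1, 2, 3, 4, 5, 6};
constexpr V8i16 DemoB{1, -1, 1, 2, 3, 4, 5, 6};
constexpr V8i16 DemoSrc{9, 9, 9, 9, 9, 9, 9, 9};
static_assert(maskAddsEpi16(DemoSrc, 0x07, DemoA, DemoB)[0] == 32767);  // saturated high
static_assert(maskAddsEpi16(DemoSrc, 0x07, DemoA, DemoB)[1] == -32768); // saturated low
static_assert(maskAddsEpi16(DemoSrc, 0x07, DemoA, DemoB)[3] == 9);      // passthrough
static_assert(maskzAddsEpi16(0x07, DemoA, DemoB)[3] == 0);              // zeroed
```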
Adds `transform.xegpu.set_op_layout_attr` transform op that attaches
`xegpu.layout` attribute to the target op.
…llvm#166961)

Add a test case to check that library calls are used for the memmove millicode.
…166459)

First, internal variables are always global, so use the global AS by
default unless specified otherwise. We can't really use `0` as the
default like we do now because it has an actual meaning on some
targets. We need to distinguish specified from unspecified, so I used
`std::optional`, which is already used in many places in OMPIRBuilder.

Second, for the critical lock variable, add an addrspace cast if needed.

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
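
The "specified vs unspecified" distinction can be sketched as below. `TargetInfo`, `resolveInternalVarAddrSpace`, and `needsAddrSpaceCast` are hypothetical names for illustration, not OMPIRBuilder's interface.

```cpp
#include <optional>

// Hypothetical target description; the field name is illustrative.
struct TargetInfo {
  unsigned GlobalAddrSpace;
};

// An unset optional means "caller did not specify", which is different
// from an explicit 0 (0 is a real address space on some targets).
constexpr unsigned resolveInternalVarAddrSpace(std::optional<unsigned> Requested,
                                               const TargetInfo &TI) {
  return Requested.value_or(TI.GlobalAddrSpace);
}

// A cast is needed only when the variable's address space differs from
// the one the consumer expects.
constexpr bool needsAddrSpaceCast(unsigned VarAS, unsigned ExpectedAS) {
  return VarAS != ExpectedAS;
}

static_assert(resolveInternalVarAddrSpace(std::nullopt, TargetInfo{1}) == 1);
static_assert(resolveInternalVarAddrSpace(0u, TargetInfo{1}) == 0);
```

With a plain `unsigned` parameter defaulting to `0`, the second case could not be told apart from "use the default".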
mlir-opt's registerAndParseCLIOptions() forces users to both register
default MLIR options and parse the command line string. Custom mlir-opt
implementations, however, may need to provide their own options or their
own parsing. Separating the two functions makes these customizations
easier.

For example, one can register "default" options, then register custom
options (not available in standard mlir-opt), then parse all of them.
Other cases include two-stage parsing where some additional options
become available based on parsed information (e.g. compilation target
can allow additional options to be present).
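
The register-then-extend-then-parse flow can be illustrated with a toy registry. Everything here (`OptionRegistry`, `registerDefaultOptions`, `runCustomTool`, and the flag names) is hypothetical and unrelated to the actual MLIR entry points.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical option registry; it stands in for a real command-line
// library and only tracks which flag names are known and which were seen.
class OptionRegistry {
public:
  void registerOption(const std::string &Name) { Known.insert(Name); }

  // Parses Args; fails on the first flag that was never registered.
  bool parse(const std::vector<std::string> &Args) {
    for (const std::string &Arg : Args) {
      if (Known.count(Arg) == 0)
        return false;
      Seen.insert(Arg);
    }
    return true;
  }

  bool isSet(const std::string &Name) const { return Seen.count(Name) != 0; }

private:
  std::set<std::string> Known;
  std::set<std::string> Seen;
};

// Step 1: the "default" registration, callable on its own.
inline void registerDefaultOptions(OptionRegistry &R) {
  R.registerOption("--default-opt");
}

// A custom driver: default options, then tool-specific ones, then one parse.
inline bool runCustomTool(const std::vector<std::string> &Args) {
  OptionRegistry R;
  registerDefaultOptions(R);
  R.registerOption("--my-custom-opt"); // not available in the standard tool
  return R.parse(Args);
}
```

If registration and parsing were fused into one call, the custom flag would have to be registered before that call or it would be rejected as unknown.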
…#166782)

According to SPIR-V spec:

> It is invalid to decorate any given id or structure member more than
one time with the same
[decoration](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Decoration),
unless explicitly allowed below for a specific decoration.

`FuncParamAttr` explicitly allows multiple uses of the decoration on the
same id, so this patch honors it.
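
A minimal sketch of the rule, assuming a tracker that decides whether a decoration may be emitted for an id. The `Decoration` enum is an illustrative subset without the spec's numeric values, `DecorationTracker` is a hypothetical name, and decoration operands are ignored.

```cpp
#include <cstdint>
#include <set>
#include <utility>

// Illustrative subset of decorations (no numeric values from the spec).
enum class Decoration { Restrict, Aliased, FuncParamAttr };

// Per the rule quoted above, only explicitly listed decorations may
// repeat on the same id; this sketch lists just FuncParamAttr.
constexpr bool allowsRepeats(Decoration D) {
  return D == Decoration::FuncParamAttr;
}

class DecorationTracker {
public:
  // Returns true if decorating Id with D is allowed at this point.
  bool shouldEmit(std::uint32_t Id, Decoration D) {
    if (allowsRepeats(D))
      return true;
    // insert().second is false when the (id, decoration) pair was seen.
    return Seen.insert({Id, D}).second;
  }

private:
  std::set<std::pair<std::uint32_t, Decoration>> Seen;
};
```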
This PR adds all the missing doc strings in IRCore.cpp. It also

1. Normalizes all doc strings to have proper punctuation;
2. Inlines non-duplicated docstrings which are currently at the top of
the source file (and thereby possibly out of sync).

Follow-up PRs will do the same for the rest of the modules/source files.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- unify isRAStateSigned and isRAStateUnsigned to a common getRAState,
- unify setRASigned and setRAUnsigned into setRAState(MCInst, bool),
- update users of these to match the new implementations.
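
One way to picture the unification: a single tri-state getter replaces the two boolean queries, and one setter takes the state as a flag. The `Inst` struct and the `std::optional<bool>` return type are assumptions for this sketch; the commit message only names `getRAState` and `setRAState(MCInst, bool)`.

```cpp
#include <optional>

// Stand-in for an instruction carrying an optional RA-state annotation.
struct Inst {
  std::optional<bool> RAState; // unset: no state recorded
};

// Assumed shape of the unified getter: nullopt means no annotation,
// true means signed, false means unsigned.
inline std::optional<bool> getRAState(const Inst &I) { return I.RAState; }

// One setter instead of separate "set signed" and "set unsigned" calls.
inline void setRAState(Inst &I, bool Signed) { I.RAState = Signed; }
```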
…ubstNonTypeTemplateParmPackExpr, PseudoObjectExpr (llvm#160904)

Add new visit functions to ASTImporter for CXXParenListInitExpr,
SubstNonTypeTemplateParmPackExpr and PseudoObjectExpr.
In CTU analysis there are a lot of "cannot import unsupported AST node"
errors for CXXParenListInitExpr, SubstNonTypeTemplateParmPackExpr and
PseudoObjectExpr. The problem appeared after full support for Concepts
was added to the importer.
…4648)

Flang on Windows added `-latomic` to the link line. This library does
not exist on Windows and the linker gives a warning.
After llvm#163011 was merged, the tests in
[`offload/test/offloading/gpupgo`](https://github.com/llvm/llvm-project/compare/main...EthanLuisMcDonough:llvm-project:gpupgo-names-fix-pr?expand=1#diff-f769f6cebd25fa527bd1c1150cc64eb585c41cb8a8b325c2bc80c690e47506a1)
broke because the offload plugins were no longer able to find
`__llvm_prf_nm`. This pull request explicitly makes `__llvm_prf_nm`
visible to the host on GPU targets and reverses the changes made in
f7e9968.
@ronlieb ronlieb requested review from a team and dpalermo November 10, 2025 18:14
@z1-cciauto z1-cciauto merged commit 05bb629 into amd-staging Nov 11, 2025
9 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20251110110356 branch November 11, 2025 00:27