@ronlieb ronlieb commented Nov 27, 2025

No description provided.

pcc and others added 23 commits November 26, 2025 12:37
Deactivation symbols are a mechanism for allowing object files to disable
specific instructions in other object files at link time. The initial use
case is for pointer field protection.

For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

Reviewers: ojhunt, nikic, fmayer, arsenm, ahmedbougacha

Reviewed By: fmayer

Pull Request: llvm#133536
Deactivation symbol operands are supported in the code generator by
building on the previously added support for IRELATIVE relocations.

Reviewers: ojhunt, fmayer, ahmedbougacha, nikic, efriedma-quic

Reviewed By: fmayer

Pull Request: llvm#133537
This just required wiring up some additional AMDGPU TableGen-generated files.
The Language Environment (LE) reserves 128 bytes for the argument area
when the optional field is not present. If the argument area is larger,
then the field must be present to guarantee that the space is reserved
on stack extension. Creating this field when alloca() is used may reduce
the needed stack space in case alloca() causes a stack extension.
This patch fixes:

  llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp:245:25: error: unused
  variable 'TII' [-Werror,-Wunused-variable]
…lvm#169720)

This commit introduces the ACCImplicitDeclare pass to the OpenACC
dialect, complementing ACCImplicitData by handling global variables
referenced in OpenACC compute regions and routines.

Overview:
---------
The pass applies implicit `acc declare` actions to global variables
referenced in OpenACC regions. While the OpenACC spec focuses on
implicit data mapping (handled by ACCImplicitData), implicit declare is
advantageous and required for specific cases:

1. Globals referenced in implicit `acc routine` - Since data mapping
only applies to compute regions, globals in routines must use `acc
declare`.

2. Compiler-generated globals - Type descriptors, runtime names, and
error reporting strings introduced during compilation that wouldn't be
visible for user-provided `acc declare` directives.

3. Constant globals - Constants like filename strings or initialization
values benefit from being marked with `acc declare` rather than being
mapped repeatedly (e.g., 1000 kernel launches shouldn't map the same
constant 1000 times).

Implementation:
---------------
The pass performs this in two phases:

1. Hoisting: Non-constant globals in compute regions have their
address-of operations hoisted out of the region when possible, allowing
implicit data mapping instead of declare marking.

2. Declaration: Remaining globals that must be available on the device (constants,
globals in routines, globals in recipe operations) are marked with the
acc.declare attribute.

The pass processes:
- OpenACC compute constructs (parallel, kernels, serial)
- Functions marked with acc routine
- Private, firstprivate, and reduction recipes (when used)
- Initialization regions of existing declared globals

Requirements:
-------------
The pass requires operations to implement:
- acc::AddressOfGlobalOpInterface (for address-of ops)
- acc::GlobalVariableOpInterface (for global definitions)
- acc::IndirectGlobalAccessOpInterface (for indirect access)
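
The two phases described above can be read as a per-reference decision. The following is a minimal sketch of that decision only, not the pass's actual code: `GlobalUse`, `Action`, and `classify` are hypothetical names, and the sketch assumes that every global which is not hoisted ends up marked with `acc.declare`.

```cpp
// Hypothetical summary of one reference to a global; these fields are
// illustrative and do not correspond to real MLIR or OpenACC dialect types.
struct GlobalUse {
  bool isConstant;        // the global is a constant
  bool inRoutine;         // referenced from a function marked acc routine
  bool inRecipe;          // referenced from a private/firstprivate/reduction recipe
  bool canHoistAddressOf; // the address-of op can be moved out of the region
};

enum class Action { HoistAddressOf, MarkDeclare };

// Assumption: anything that is not hoisted in phase 1 is declared in phase 2.
Action classify(const GlobalUse &use) {
  // Phase 1 (hoisting): only non-constant globals in compute regions qualify,
  // and only when hoisting is possible. Implicit data mapping then applies.
  bool inComputeRegion = !use.inRoutine && !use.inRecipe;
  if (!use.isConstant && inComputeRegion && use.canHoistAddressOf)
    return Action::HoistAddressOf;

  // Phase 2 (declaration): constants, globals in routines, and globals in
  // recipe operations fall through to acc.declare marking.
  return Action::MarkDeclare;
}
```

Under these assumptions, a constant filename string is always declared, while a mutable global used in a `parallel` region is hoisted whenever its address-of operation can be moved.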
…est (llvm#169717)

Otherwise the test can fail in weirder setups (like ours downstream
where the actual binary path only contains the hash of the object). This
makes the test more resilient, more consistent with other driver tests,
and allows us to assert that the binary is named clang rather than
clang-<some suffix>.
)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

The following utilities have been annotated in this patch:

- [x] `<barrier>`
- [x] `<condition_variable>`
- [x] `<latch>`
- [x] `<mutex>`
- [x] `<semaphore>`
- [x] `<thread>`

N.B. Some classes don't provide all of the specified methods; the missing
methods were not annotated.
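
As an illustration of why discarding such a return value is usually a bug, here is a minimal sketch using a hypothetical `SpinLock` class. It is not libc++ code; it only shows the kind of function the guideline targets.

```cpp
#include <atomic>

// Hypothetical spin lock for illustration only; not a libc++ class.
class SpinLock {
public:
  // A caller that ignores the result does not know whether it owns the
  // lock, so a discarded return value is almost certainly a mistake.
  [[nodiscard]] bool try_lock() noexcept {
    return !flag_.test_and_set(std::memory_order_acquire);
  }

  void unlock() noexcept { flag_.clear(std::memory_order_release); }

private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

With the attribute in place, a bare `lock.try_lock();` statement draws a compiler warning, while `if (lock.try_lock()) { ... }` compiles cleanly.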
This change adds undef handling that was needed to enable global
lambdas. There was no lambda-specific code needed, but the global lambda
handling needed to initialize a global with an undef value.

[CIR] Handle undef init of struct

This adds handling for a case where Clang initializes a struct to undef
with a constant copy. This required adding support for undef constants
and lowering undef attributes to LLVM IR.
This moves all builtin-related CodeGen tests to a new directory,
separate from the main clang/test/CIR/CodeGen directory. This will make
it easier to run the basic CodeGen tests without running the builtin
tests. This is specifically intended to move those tests which include
`immintrin.h` or any of its variants, which take a very long time to
compile with a debug build.
…lvm#157917)

Further to
llvm#147134 (comment),
switch to using the madvise() API to page in mmap'd files. The changes are:

1) All new code is compiled under #if LLVM_ENABLE_THREADS, so it is easy
to see which changes come from this PR.
2) The new PR uses madvise() instead of the ad-hoc page-referencing code I
wrote, which should avoid SIGSEGVs if the buffer is deallocated.
3) A new property, SerialBackgroundQueue().stopAllWork, stops the
background workers when there is no further call for them. Usually the
background "page-in" threads have completed first, but with this
troublesome test that is not always the case, and buffers stored in the
static input file cache are being deallocated while still referenced.
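
For readers unfamiliar with the approach in item 2, the following is a minimal POSIX sketch of hinting the kernel to page in a mapped file. It is not the code from this PR: `mapAndPrefetch` is a hypothetical helper, and the choice of `MADV_WILLNEED` as the advice value is an assumption, since the commit message does not name it.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps `path` read-only and asks the kernel to start
// paging it in. Returns true if both mmap() and madvise() succeeded.
bool mapAndPrefetch(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return false;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size == 0) {
    close(fd);
    return false;
  }
  std::size_t size = static_cast<std::size_t>(st.st_size);

  void *addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd); // The mapping stays valid after the descriptor is closed.
  if (addr == MAP_FAILED)
    return false;

  // Assumption: MADV_WILLNEED is the advice used. It is only a hint, and no
  // user-space code dereferences the buffer to fault the pages in.
  bool ok = madvise(addr, size, MADV_WILLNEED) == 0;

  munmap(addr, size);
  return ok;
}
```

Because the hint is issued through a system call rather than by touching the pages from a background thread, there is no user-space read that could fault on an unmapped address.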

---------

Co-authored-by: James Henderson <James.Henderson@sony.com>
…ing dependencies (llvm#169461)

Adds requirements.txt and lock files for installing dependencies for
github-upload-release.py script.

Signed-off-by: Ryan Mast <mast.ryan@gmail.com>
This patch significantly optimizes the LiveElementPrinter
by replacing a slow linear search with efficient hash map
lookups. It refactors the code to use a map-based system
for tracking live element addresses and managing column
assignments, leading to a major performance improvement
for large binaries.
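
The general technique can be sketched as follows. This is not the LiveElementPrinter implementation: `LiveRange`, `findColumnLinear`, `buildIndex`, and `findColumnIndexed` are hypothetical names, and keying the map by start address is an assumption made for illustration.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical record; not the actual LiveElementPrinter data structure.
struct LiveRange {
  std::uint64_t startAddr;
  int column;
};

// Before: every query scans the whole list, O(n) per lookup.
int findColumnLinear(const std::vector<LiveRange> &ranges, std::uint64_t addr) {
  for (const LiveRange &r : ranges)
    if (r.startAddr == addr)
      return r.column;
  return -1; // not found
}

// After: build the index once, then answer each query in O(1) on average.
std::unordered_map<std::uint64_t, int>
buildIndex(const std::vector<LiveRange> &ranges) {
  std::unordered_map<std::uint64_t, int> index;
  index.reserve(ranges.size());
  for (const LiveRange &r : ranges)
    index.emplace(r.startAddr, r.column); // keeps the first match, like the scan
  return index;
}

int findColumnIndexed(const std::unordered_map<std::uint64_t, int> &index,
                      std::uint64_t addr) {
  auto it = index.find(addr);
  return it == index.end() ? -1 : it->second;
}
```

The index costs one pass and extra memory up front, which pays off when the same set of elements is queried once per disassembled instruction.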
…9747)

WaitingOnGraph::processReadyOrFailed was not clearing stale entries from
the ElemToPendingSN map. If symbols were removed from the
ExecutionSession and then re-added this could lead to dependencies on
the stale entries, triggering a use-after-free bug.

llvm#169135
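
The class of bug described here can be sketched in isolation. The types below are hypothetical stand-ins, not the real WaitingOnGraph code: a lookup map holds raw pointers to nodes owned elsewhere, so the entries must be erased before a node is destroyed.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pending node; not the real ORC type.
struct PendingNode {
  std::vector<std::string> elems;
};

class PendingTracker {
public:
  PendingNode *add(std::vector<std::string> elems) {
    auto node = std::make_unique<PendingNode>();
    node->elems = std::move(elems);
    PendingNode *raw = node.get();
    for (const std::string &e : raw->elems)
      elemToPending_[e] = raw;
    nodes_.push_back(std::move(node));
    return raw;
  }

  // Retires a node that is ready or failed. Erasing its map entries before
  // destroying it is the important step: skipping it leaves dangling
  // pointers that a later add() of the same element could follow.
  void retire(PendingNode *node) {
    for (const std::string &e : node->elems) {
      auto it = elemToPending_.find(e);
      if (it != elemToPending_.end() && it->second == node)
        elemToPending_.erase(it);
    }
    for (auto it = nodes_.begin(); it != nodes_.end(); ++it) {
      if (it->get() == node) {
        nodes_.erase(it);
        break;
      }
    }
  }

  PendingNode *lookup(const std::string &elem) const {
    auto it = elemToPending_.find(elem);
    return it == elemToPending_.end() ? nullptr : it->second;
  }

private:
  std::vector<std::unique_ptr<PendingNode>> nodes_;
  std::unordered_map<std::string, PendingNode *> elemToPending_;
};
```

In this sketch, removing an element and re-adding it yields a fresh node, and no lookup can return a pointer to a destroyed one.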
…169624)

The existing code can be simplified further.

---------

Co-authored-by: Matej Košík <matej.kosik@codasip.com>
…169625)

Add argument printing to test-liveness-analysis to make it easier to debug
the remove-dead-values pass.

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
@ronlieb ronlieb requested review from a team and dpalermo November 27, 2025 02:21
@z1-cciauto z1-cciauto merged commit b9c347b into amd-staging Nov 27, 2025
16 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20251126200328 branch November 27, 2025 05:05