Conversation

@z1-cciauto
Collaborator

No description provided.

rampitec and others added 30 commits November 19, 2025 10:24
…168163)

This is to show the spilling of WMMA scale values, which are limited
to the low 256 VGPRs. We have free registers; the register allocator
just allocates the low 256 first.
Currently LibcallLoweringInfo is defined inside of TargetLowering,
which is owned by the subtarget. Pass in the subtarget so we can
construct LibcallLoweringInfo with the subtarget. This is a temporary
step that should be revertable in the future, after LibcallLoweringInfo
is moved out of TargetLowering.
…e time. (llvm#163468)

Document a define to allow library developers to support disabling 
AddressSanitizer's container overflow detection in template code at 
compile time.

The primary motivation is to reduce false positives in environments where
libraries and frameworks that cannot be recompiled with sanitizers enabled
are called from application code. This supports disabling checks when the
runtime environment cannot be reliably controlled to use ASAN_OPTIONS.

Key changes:
- Use the define `__SANITIZER_DISABLE_CONTAINER_OVERFLOW__` to disable
  instrumentation at compile time
- Redefine the container overflow APIs in common_interface_defs.h to
  provide a null implementation when the define is present
- Update documentation in AddressSanitizer.rst to suggest and illustrate
  use of the define
- Add details of the define in PrintContainerOverflowHint()
- Add test disable_container_overflow_checks to verify the new hints on the
  error and fill the testing gap that
  ASAN_OPTIONS=detect_container_overflow=0 works
- Add tests demonstrating the issue around closed source libraries and 
  instrumented apps that both modify containers

This requires no compiler changes and should be supportable across
compiler toolchains.

An RFC has been opened to discuss: 

https://discourse.llvm.org/t/rfc-add-fsanitize-address-disable-container-overflow-flag-to-addresssanitizer/88349
This patch changes the type of SubRegIndicesSize to size_t.  The
original type deduced for "auto" is a signed type, but size_t, an
unsigned type, is safe here according to the usage.
…get LHS for complex compound assignment (llvm#166798)

- Fixes llvm#166512
- `ComplexExprEmitter::EmitCompoundAssignLValue` is calling
`EmitLoadOfScalar(LValue, SourceLocation)` to load the LHS value in the
case that it's non-complex, however this function requires that the
value is a simple LValue - issue occurred because the LValue in question
was a bitfield LValue. I changed it to use this function which seems to
handle all of the different cases (deferring to the original
`EmitLoadOfScalar` if it's a simple LValue)
R_AARCH64_FUNCINIT64 is a dynamic relocation type for relocating
word-sized data in the output file using the return value of
a function. An R_AARCH64_FUNCINIT64 shall be relocated as an
R_AARCH64_IRELATIVE with the target symbol address if the target
symbol is non-preemptible, and it shall be a usage error to relocate an
R_AARCH64_FUNCINIT64 with a preemptible or STT_GNU_IFUNC target symbol.

The initial use case for this relocation type shall be for emitting
global variable field initializers for structure protection. With
structure protection, the relocation value computation is tied to the
compiler implementation in such a way that it would not be reasonable to
define a relocation type for it (for example, it may involve computing
a hash using a compiler-determined algorithm), hence the need for the
computation to be implemented as code in the binary.

Part of the AArch64 psABI extension:
ARM-software/abi-aa#340

Reviewers: smithp35, fmayer, MaskRay

Reviewed By: fmayer

Pull Request: llvm#156564
This adds the scalar expression visitor needed to handle default
arguments being passed to constructors.
vulkan_sdk_setup is the name of the method that configures it, but the
repo itself has the name vulkan_sdk

This was caught by enabling the bzlmod flag for CI. The GH action runs
`blaze test @llvm-project/...` but the target is tagged manual, so it's
excluded. The buildkite CI runs `bazel query | xargs bazel test` which
will include manual targets.
Clean up some of the existing predicated load/store sink/hoisting tests
and add additional test coverage for more complex cases.
Given a set of pointers, check if they can be rearranged as follows (%s is a constant):
%b + 0 * %s + 0
%b + 0 * %s + 1
%b + 0 * %s + 2
...
%b + 0 * %s + w

%b + 1 * %s + 0
%b + 1 * %s + 1
%b + 1 * %s + 2
...
%b + 1 * %s + w
...

If the pointers can be rearranged in the above pattern, it means that the
memory can be accessed with strided loads of width `w` and stride `%s`.
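
As a rough illustration of the pattern above (not the code from the patch), here is a minimal Python sketch that works on plain integer offsets instead of IR pointers. The function name `strided_layout` and the assumption that the stride is a known positive integer are hypothetical; each row holds `w + 1` elements in the notation above.

```python
def strided_layout(offsets, stride):
    """Check whether integer offsets can be arranged as base + i*stride + j.

    Returns (base, num_rows, elems_per_row) on success, otherwise None.
    elems_per_row corresponds to w + 1 in the pattern above.
    """
    if not offsets or stride <= 0:
        return None
    base = min(offsets)
    rows = {}
    for off in offsets:
        # Split the distance from the base into a row index and a column index.
        row, col = divmod(off - base, stride)
        rows.setdefault(row, set()).add(col)
    num_rows = len(rows)
    # Row indices must be exactly 0, 1, ..., num_rows - 1.
    if set(rows) != set(range(num_rows)):
        return None
    elems_per_row = len(rows[0])
    expected_cols = set(range(elems_per_row))
    # Every row must contain the same consecutive columns 0..w.
    if any(cols != expected_cols for cols in rows.values()):
        return None
    # Reject duplicated offsets.
    if len(offsets) != num_rows * elems_per_row:
        return None
    return base, num_rows, elems_per_row
```

The input order does not matter, which mirrors the "can be rearranged" wording: the check only cares whether some ordering matches the pattern.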
…llvm#168026)

The previous implementation of `combineOp_VLToVWOp_VL` manually replaced old
nodes with newly created widened nodes, but only added the new node itself to
the `DAGCombiner` worklist. Since the users of the new node were not added,
some combine opportunities could be missed when external `DAGCombiner` passes
expected those users to be reconsidered.

This patch replaces the custom replacement logic with a call to
`DCI.CombineTo()`, which performs node replacement in a way consistent with
`DAGCombiner::Run`:
- Replace all uses of the old node.
- Add the new node and its users to the worklist.
- Clean up unused nodes when appropriate.

Using `CombineTo` ensures that `combineOp_VLToVWOp_VL` behaves consistently with
the standard `DAGCombiner` update model, avoiding discrepancies between the
private worklist inside this routine and the global worklist managed by the
combiner.

This resolves missed combine cases involving VL -> VW operator widening.

---------

Co-authored-by: Kai Lin <omg_link@qq.com>
…m#168642)

This is not needed now that we are using the container to run the
workflow.
It doesn't look like these tests are actually run on Windows, so we
don't need it.
We don't need this anymore since all new contributors in the last year
have applied for commit access using GitHub issues. There is already
code in the script that removes anyone who submitted a request, so we
don't need the old code any more (which was way too conservative and
very slow).
…7815)

To prepare for more backends to use Mustache templates, this patch lifts
the Mustache utilities from `HTMLMustacheGenerator.cpp` to
`Generators.h`. A MustacheGenerator interface is created to share code for
template creation.
The collection of library function names in TargetLibraryInfo faces
similar challenges as RuntimeLibCalls in the IR component. The number of
function names is large, there are numerous customizations based on the
triple (including alternate names), and there is a lot of replicated
data in the signature table.

The ultimate goal would be to capture all library function related
information in a .td file. This PR brings the current .def file to
TableGen, almost as a 1:1 replacement. However, there are some
improvements which are not possible in the current implementation:

- the function names are now stored as a long string together with an
offset table.
- the table of signatures is now deduplicated, using an offset table for
access.

The size of the object file decreases by about 34kB with these changes. The
hash table of all function names is still constructed dynamically. A
static table like for RuntimeLibCalls is the next logical step.

The main motivation for this change is that I have to add a large number
of custom names for z/OS (like in RuntimeLibCalls.td), and the current
infrastructure does not support this very well.
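
To make the two layout improvements listed above concrete, here is a small Python sketch of the general technique: one long string plus an offset table for names, and a deduplicated signature table reached through indices. It is only an illustration. The helper names, the NUL terminator, and the tuple encoding of signatures are assumptions, not the format TableGen actually emits.

```python
def build_string_table(names):
    """Pack names into one string and record where each name starts."""
    blob = ""
    offsets = []
    for name in names:
        offsets.append(len(blob))
        blob += name + "\0"  # assumed terminator so a lookup knows where to stop
    return blob, offsets


def lookup_name(blob, offsets, index):
    """Recover the index-th name from the packed string."""
    start = offsets[index]
    end = blob.index("\0", start)
    return blob[start:end]


def dedup_signatures(signatures):
    """Store each distinct signature once and map every function to its slot."""
    unique = []
    slot_of = {}
    table = []
    for sig in signatures:
        if sig not in slot_of:
            slot_of[sig] = len(unique)
            unique.append(sig)
        table.append(slot_of[sig])
    return unique, table
```

Functions that share a signature then share one table entry, which is where the space saving for replicated signature data would come from.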
)

Using the builtin failed on 32-bit architectures:
```
clang/lib/AST/ExprConstant.cpp:14299: [..]: Assertion `I.getBitWidth() == Info.Ctx.getIntWidth(E->getType()) && "Invalid evaluation result."' failed.
```

The return type is meant to be size_t. Fix it.
I found that the issues we've been seeing in the HTML
whitespace/alignment are due to partials inserting their own whitespace
and calling partials on indented lines or lines containing text already.
This patch gets rid of unnecessary whitespace in the comment and
function partials so that they are properly indented when inserted.
The test was added in llvm#147252. On a 32-bit target, it fails with error:
```
  File "...\TestDataFormatterLibcxxInvalidString.py", line 23, in test
    self.skip()
    ^^^^^^^^^
AttributeError: 'LibcxxInvalidStringDataFormatterTestCase' object has no attribute 'skip'
```
jhuber6 and others added 19 commits November 19, 2025 15:56
Summary:
We start this thread if the RPC client symbol is detected in the loaded
binary. We should make this sleep if there's no work to avoid the thread
running at high priority when the (scarcely used) RPC call is actually
required. So, right now after 25 microseconds we will assume the server
is inactive and begin sleeping. This resets once we do find work.

AMD supports a more intelligent way to do this. HSA signals can wake a
sleeping thread from the kernel, and signals can be sent from the GPU
side. This would be nice to have and I'm planning on working with it in
the future to make this infrastructure more usable with existing AMD
workloads.
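
The idle policy described in the summary can be sketched roughly as follows. This is a Python illustration of the general "spin while busy, sleep once idle" loop, not the offload runtime's code. The 25 microsecond threshold comes from the description; the function name `serve`, its parameters, and the 1 ms sleep interval are assumptions.

```python
import time

IDLE_THRESHOLD_S = 25e-6  # 25 microseconds, per the description above


def serve(poll_once, should_stop, sleep=time.sleep, now=time.monotonic,
          idle_threshold=IDLE_THRESHOLD_S, sleep_interval=1e-3):
    """Poll for work; start sleeping between polls once idle long enough.

    poll_once() returns True if it handled work. sleep_interval is an
    assumed value; the description does not say how long the thread sleeps.
    """
    last_work = now()
    while not should_stop():
        if poll_once():
            # Finding work resets the idle timer.
            last_work = now()
        elif now() - last_work >= idle_threshold:
            # No work for a while: yield the CPU instead of spinning.
            sleep(sleep_interval)
```

The clock and sleep functions are parameters only so the loop can be exercised deterministically with a fake clock.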
This PR upstreams `cir.await` and adds initial codegen for emitting a
skeleton of the ready, suspend, and resume branches. Codegen for these
branches is left for a future PR. It also adds a test for the invalid
case where a `cir.func` is marked as a coroutine but does not contain a
`cir.await` op in its body.
Replace retrieving FMFs for in-loop reduction via underlying instruction
+ legal by collecting the flags during reduction chain traversal in
VPlan.
It appears that this broke the build by not using the 'correct' name for
the expression. This is probably something that crossed in review.
In my last patch, it became clear during code review that the postfix
operation was actually a read THEN update, not update/read like other
single line versions. It wasn't clear at the time how much additional
work this would be to make postfix work correctly (and they are a bit of
a 'special' thing in codegen anyway), so this patch adds some
functionality to sense this and special-cases it when generating the
statement info for capture.
This implements null base class initialization for non-empty bases.
Use `IfGuardEmitter` in CallingConvEmitter. Additionally refactor the
code a bit to extract duplicated code to emit the CC function prototype
into a helper function.
…#168783)

I saw this while doing lowering, we were not properly looking into the
array sections for the variable. Presumably we didn't do a good job of
making sure we did this right when making this extension, and missed
this spot.
…ypes (llvm#163465)

This change adds resource handle type `__hlsl_resource_t` to the list of types recognized in the Clang's built-in functions prototype string.

HLSL has built-in resource classes and some of them have many methods, such as
[Texture2D](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-texture2d).
Most of these methods will be implemented by built-in functions that will take resource handle as an argument. This change enables us to move from generic `void(...)` prototype string for these methods and explicit argument checking in `SemaHLSL.cpp` to a prototype string with explicit argument types. Argument checking in `SemaHLSL.cpp` can be reduced to handle just the rules that cannot be expressed in the prototype string (for example verifying that the offset value in `__builtin_hlsl_buffer_update_counter` is `1` or `-1`).

In order to make this work, we now allow conversions from attributed resource handle type such as `__hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(float)]]` to a plain non-attributed `__hlsl_resource_t` type.
…68382)

LLVM recently started including the bit-size as a `DW_AT_bit_size` (and as
part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark
`_BitInt`s as "reconstitutable" when compiling with
`-gsimple-template-names`. However, before doing so we need to make sure
the `DWARFTypePrinter` can reconstruct template parameter values that
have `_BitInt` type. This patch adds support for printing
`DW_TAG_template_value_parameter`s that have `_BitInt` type. Since
`-gsimple-template-names` only omits template parameters that are at most
64 bits wide, we don't support `_BitInt`s larger than 64 bits.
llvm#168762)

I'm not aware of any way for `%run` wrapper scripts like
`iossim_run.py`
([ref](https://github.com/llvm/llvm-project/blob/d2c7c6064259320def7a74e111079725958697d4/compiler-rt/test/sanitizer_common/ios_commands/iossim_run.py#L4))
to know what testcase they are currently running. This can be useful if
these wrappers need to create a (potentially remote) temporary directory
for each test case.

This adds the `LIT_CURRENT_TESTCASE` environment variable to both the
internal shell and the external shell, containing the full name of the
current test being run.
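
As a hedged example of how a wrapper script might use this, the Python sketch below derives a per-test directory from `LIT_CURRENT_TESTCASE`. Only the variable name comes from the description above. The helper `per_test_tmpdir`, the sanitizing rule, and the fallback name are hypothetical.

```python
import os
import re


def per_test_tmpdir(root, environ=None):
    """Create and return a directory under root named after the current test."""
    environ = os.environ if environ is None else environ
    # Fall back to a fixed name when the wrapper is not run under lit.
    test_name = environ.get("LIT_CURRENT_TESTCASE", "unknown-test")
    # Replace characters that are awkward in paths (spaces, ':', '/').
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", test_name).strip("_")
    path = os.path.join(root, safe or "unknown-test")
    os.makedirs(path, exist_ok=True)
    return path
```

A wrapper for a remote device could apply the same naming to a remote path instead of a local one.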
…cialCaseList.cpp llvm#168088 (llvm#168779)

This test has a long call chain in recursion. The search tree can be pruned
early by swapping the CC test and the recursive simplifyAssumingCCVal call.

Fixes: llvm#168088
Co-authored-by: anoopkg6 <anoopkg6@github.com>
Another miss when working through 'link', we didn't properly handle
giving the whole array-section expression or array index expression,
instead allowed it to only get the decl-ref-expr.  This patch makes
sure we don't add the wrong thing.
…simple-template-names (llvm#168383)

Depends on:
* llvm#168382

LLVM recently started including the bit-size as a `DW_AT_bit_size` (and as
part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark
`_BitInt`s as "reconstitutable" when compiling with
`-gsimple-template-names`. We still only omit template parameters that
are at most 64 bits wide, so support for `_BitInt`s larger than 64 bits
is not part of this patch.
… intrinsics. (llvm#168668)

We can constant fold interleave of identical splat vectors to a larger
splat vector.
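
The identity behind this fold can be shown with plain Python lists standing in for vectors. This is an illustration of the arithmetic only, not the LLVM implementation, and the helper names are hypothetical.

```python
def splat(value, n):
    """A vector of n copies of value."""
    return [value] * n


def interleave(vectors):
    """Interleave equal-length vectors element by element."""
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all operands must have the same length")
    return [v[i] for i in range(length) for v in vectors]


def fold_interleave_of_splats(vectors):
    """Return the folded splat if every operand is the same splat, else None."""
    first = vectors[0]
    if not first:
        return None
    value = first[0]
    same_splat = all(
        len(v) == len(first) and all(e == value for e in v) for v in vectors
    )
    if not same_splat:
        return None
    # Interleaving k identical splats of length n gives a splat of length k*n.
    return splat(value, len(first) * len(vectors))
```

For identical splat operands the folded result equals what `interleave` would compute, so the operation can be replaced by the larger splat.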
Ref commit in incubator: ee17ff6
 
There is a minor change in the assumption for emitting a direct callee.
In incubator, `bool hasAttributeNoBuiltin = false`
(`llvm-project/clang/lib/CIR/CodeGen/CIRGenExpr.cpp:1671`), while in
upstream, it's true, therefore, the call to finite(...) is not converted
to a builtin anymore.

Fixes llvm#163892
@z1-cciauto z1-cciauto requested a review from a team November 19, 2025 23:54
@z1-cciauto z1-cciauto merged commit 0d7728c into amd-staging Nov 20, 2025
10 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202511191854 branch November 20, 2025 02:42