merge main into amd-staging #629

z1-cciauto · 2025-11-19T23:54:42Z

No description provided.

…168163) This is to show the spilling of WMMA scale values which are limited to low 256 VGPRs. We have free registers, just RA allocates low 256 first.

…ll. NFC This matches how IR is printed.

Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.

…e time. (llvm#163468) Document a define that lets library developers disable AddressSanitizer's container overflow detection in template code at compile time. The primary motivation is to reduce false positives in environments where libraries and frameworks that cannot be recompiled with sanitizers enabled are called from application code. This supports disabling checks when the runtime environment cannot be reliably controlled to use ASAN_OPTIONS.

Key changes:

- Use the define `__SANITIZER_DISABLE_CONTAINER_OVERFLOW__` to disable instrumentation at compile time
- Redefine the container overflow APIs in common_interface_defs.h to provide a null implementation when the define is present
- Update documentation in AddressSanitizer.rst to suggest and illustrate use of the define
- Add details of the define in PrintContainerOverflowHint()
- Add test disable_container_overflow_checks to verify the new hints on the error and fill the testing gap that ASAN_OPTIONS=detect_container_overflow=0 works
- Add tests demonstrating the issue around closed source libraries and instrumented apps that both modify containers

This requires no compiler changes and should be supportable across compiler toolchains.

An RFC has been opened to discuss: https://discourse.llvm.org/t/rfc-add-fsanitize-address-disable-container-overflow-flag-to-addresssanitizer/88349

This patch changes the type of SubRegIndicesSize to size_t. The original type deduced for "auto" is a signed type, but size_t, an unsigned type, is safe here according to the usage.

…get LHS for complex compound assignment (llvm#166798)

- Fixes llvm#166512
- `ComplexExprEmitter::EmitCompoundAssignLValue` is calling `EmitLoadOfScalar(LValue, SourceLocation)` to load the LHS value in the case that it's non-complex, however this function requires that the value is a simple LValue
- issue occurred because the LValue in question was a bitfield LValue. I changed it to use this function which seems to handle all of the different cases (deferring to the original `EmitLoadOfScalar` if it's a simple LValue)

R_AARCH64_FUNCINIT64 is a dynamic relocation type for relocating word-sized data in the output file using the return value of a function. An R_AARCH64_FUNCINIT64 shall be relocated as an R_AARCH64_IRELATIVE with the target symbol address if the target symbol is non-preemptible, and it shall be a usage error to relocate an R_AARCH64_FUNCINIT64 with a preemptible or STT_GNU_IFUNC target symbol.

The initial use case for this relocation type shall be for emitting global variable field initializers for structure protection. With structure protection, the relocation value computation is tied to the compiler implementation in such a way that it would not be reasonable to define a relocation type for it (for example, it may involve computing a hash using a compiler-determined algorithm), hence the need for the computation to be implemented as code in the binary.

Part of the AArch64 psABI extension: ARM-software/abi-aa#340

Reviewers: smithp35, fmayer, MaskRay

Reviewed By: fmayer

Pull Request: llvm#156564

This adds the scalar expression visitor needed to handle default arguments being passed to constructors.

vulkan_sdk_setup is the name of the method that configures it, but the repo itself has the name vulkan_sdk. This was caught by enabling the bzlmod flag for CI. The GH action runs `blaze test @llvm-project/...` but the target is tagged manual, so it's excluded. The buildkite CI runs `bazel query | xargs bazel test`, which will include manual targets.

Clean up some of the existing predicated load/store sink/hoisting tests and add additional test coverage for more complex cases.

Given a set of pointers, check if they can be rearranged as follows (%s is a constant):

%b + 0 * %s + 0
%b + 0 * %s + 1
%b + 0 * %s + 2
...
%b + 0 * %s + w

%b + 1 * %s + 0
%b + 1 * %s + 1
%b + 1 * %s + 2
...
%b + 1 * %s + w
...

If the pointers can be rearranged in the above pattern, the memory can be accessed with strided loads of width `w` and stride `%s`.
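The pattern check described above can be illustrated with a minimal sketch over plain integer offsets. This is not the pass's implementation: it assumes the offsets from `%b` are already known constants, and the function name `match_strided_rows` and the `row_len` parameter (the number of consecutive elements per row) are hypothetical names chosen for the example.

```python
def match_strided_rows(offsets, row_len):
    """Return (first_offset, stride) if `offsets` can be rearranged into rows of
    `row_len` consecutive elements whose row starts are a constant stride apart.
    Return None otherwise. Offsets are plain integers relative to the base."""
    offs = sorted(offsets)
    if row_len <= 0 or not offs or len(offs) % row_len != 0:
        return None

    # After sorting, non-overlapping rows appear as consecutive chunks.
    rows = [offs[i:i + row_len] for i in range(0, len(offs), row_len)]

    # Every row must cover consecutive elements: +0, +1, ..., +(row_len - 1).
    for row in rows:
        if any(b - a != 1 for a, b in zip(row, row[1:])):
            return None

    starts = [row[0] for row in rows]
    if len(starts) == 1:
        return (starts[0], 0)  # a single row has no meaningful stride

    # Row starts must be equally spaced, and rows must not overlap.
    stride = starts[1] - starts[0]
    if stride < row_len:
        return None
    if any(b - a != stride for a, b in zip(starts, starts[1:])):
        return None
    return (starts[0], stride)
```

For example, the offsets `[17, 0, 33, 1, 16, 32]` with `row_len=2` sort into the rows `[0, 1]`, `[16, 17]`, `[32, 33]`, so the sketch reports a stride of 16.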

…llvm#168026) The previous implementation of `combineOp_VLToVWOp_VL` manually replaced old nodes with newly created widened nodes, but only added the new node itself to the `DAGCombiner` worklist. Since the users of the new node were not added, some combine opportunities could be missed when external `DAGCombiner` passes expected those users to be reconsidered.

This patch replaces the custom replacement logic with a call to `DCI.CombineTo()`, which performs node replacement in a way consistent with `DAGCombiner::Run`:

- Replace all uses of the old node.
- Add the new node and its users to the worklist.
- Clean up unused nodes when appropriate.

Using `CombineTo` ensures that `combineOp_VLToVWOp_VL` behaves consistently with the standard `DAGCombiner` update model, avoiding discrepancies between the private worklist inside this routine and the global worklist managed by the combiner. This resolves missed combine cases involving VL -> VW operator widening.

---------

Co-authored-by: Kai Lin <omg_link@qq.com>

…m#168642) This is not needed now that we are using the container to run the workflow.

It doesn't look like these tests are actually run on Windows, so we don't need it.

We don't need this anymore since all new contributors in the last year have applied for commit access using GitHub issues. There is already code in the script that removes anyone who submitted a request, so we don't need the old code any more (which was way too conservative and very slow).

…7815) To prepare for more backends to use Mustache templates, this patch lifts the Mustache utilities from `HTMLMustacheGenerator.cpp` to `Generators.h`. A MustacheGenerator interface is created to share code for template creation.

The collection of library function names in TargetLibraryInfo faces similar challenges as RuntimeLibCalls in the IR component. The number of function names is large, there are numerous customizations based on the triple (including alternate names), and there is a lot of replicated data in the signature table. The ultimate goal would be to capture all library function related information in a .td file.

This PR brings the current .def file to TableGen, almost as a 1:1 replacement. However, there are some improvements which are not possible in the current implementation:

- the function names are now stored as a long string together with an offset table.
- the table of signatures is now deduplicated, using an offset table for access.

The size of the object file decreases by about 34 kB with these changes. The hash table of all function names is still constructed dynamically. A static table like for RuntimeLibCalls is the next logical step.

The main motivation for this change is that I have to add a large number of custom names for z/OS (like in RuntimeLibCalls.td), and the current infrastructure does not support this very well.
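The two storage improvements listed above can be sketched in a few lines. This is only an illustration of the general technique, not the TableGen output: the helper names, the NUL-terminated layout, and the tuple encoding of signatures are assumptions made for the example, and list indices stand in for the offsets into a flat signature array.

```python
def build_string_table(names):
    """Concatenate names into one NUL-separated string plus an offset table."""
    blob = ""
    offsets = []
    for name in names:
        offsets.append(len(blob))  # where this name starts inside the blob
        blob += name + "\0"
    return blob, offsets


def lookup_name(blob, offsets, index):
    """Recover the name stored for entry `index`."""
    start = offsets[index]
    end = blob.index("\0", start)
    return blob[start:end]


def dedup_signatures(signatures):
    """Store each distinct signature once; every function keeps only an index."""
    table = []       # unique signatures, in first-seen order
    index_of = {}    # signature -> position in `table`
    per_function = []
    for sig in signatures:
        key = tuple(sig)
        if key not in index_of:
            index_of[key] = len(table)
            table.append(key)
        per_function.append(index_of[key])
    return table, per_function
```

With the names `memcpy`, `strlen`, and `memmove`, the first and third share one signature entry, which is where the saving comes from when many functions have identical prototypes.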

) Using the builtin failed on 32-bit architectures:

```
clang/lib/AST/ExprConstant.cpp:14299: [..]: Assertion `I.getBitWidth() == Info.Ctx.getIntWidth(E->getType()) && "Invalid evaluation result."' failed.
```

The return type is meant to be size_t. Fix it.

I found that the issues we've been seeing in the HTML whitespace/alignment are due to partials inserting their own whitespace and calling partials on indented lines or lines containing text already. This patch gets rid of unnecessary whitespace in the comment and function partials so that they are properly indented when inserted.

The test was added in llvm#147252. On a 32-bit target, it fails with the error:

```
File "...\TestDataFormatterLibcxxInvalidString.py", line 23, in test
    self.skip()
    ^^^^^^^^^
AttributeError: 'LibcxxInvalidStringDataFormatterTestCase' object has no attribute 'skip'
```

Summary: We start this thread if the RPC client symbol is detected in the loaded binary. We should make this sleep if there's no work to avoid the thread running at high priority when the (scarcely used) RPC call is actually required. So, right now after 25 microseconds we will assume the server is inactive and begin sleeping. This resets once we do find work.

AMD supports a more intelligent way to do this. HSA signals can wake a sleeping thread from the kernel, and signals can be sent from the GPU side. This would be nice to have and I'm planning on working with it in the future to make this infrastructure more usable with existing AMD workloads.
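The idle policy described above (poll for work, start sleeping once no work has been seen for 25 microseconds, reset on new work) can be sketched as follows. This is not the offload runtime's code: `run_server`, its callbacks, and the injectable `clock`/`sleep` parameters are hypothetical, and the sleep length is an assumption because the description does not state one.

```python
import time

IDLE_THRESHOLD_S = 25e-6  # from the description: 25 microseconds without work
SLEEP_S = 250e-6          # assumption: the sleep length is not given in the source


def run_server(poll, handle, should_stop, clock=time.monotonic, sleep=time.sleep):
    """Poll for work until `should_stop()` is true; return how often we slept."""
    last_work = clock()
    sleeps = 0
    while not should_stop():
        item = poll()
        if item is not None:
            handle(item)
            last_work = clock()  # finding work resets the idle timer
        elif clock() - last_work > IDLE_THRESHOLD_S:
            sleep(SLEEP_S)       # idle for too long: stop spinning
            sleeps += 1
    return sleeps
```

Passing `clock` and `sleep` as parameters keeps the loop testable with a fake clock; a signal-based wakeup, as mentioned for HSA, would replace the timed sleep rather than this structure.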

This PR upstreams `cir.await` and adds initial codegen for emitting a skeleton of the ready, suspend, and resume branches. Codegen for these branches is left for a future PR. It also adds a test for the invalid case where a `cir.func` is marked as a coroutine but does not contain a `cir.await` op in its body.

Replace retrieving FMFs for in-loop reduction via underlying instruction + legal by collecting the flags during reduction chain traversal in VPlan.

It appears that this broke the build by not using the 'correct' name for the expression. This is probably something that crossed in review.

In my last patch, it became clear during code review that the postfix operation was actually a read THEN update, not update/read like other single line versions. It wasn't clear at the time how much additional work this would be to make postfix work correctly (and they are a bit of a 'special' thing in codegen anyway), so this patch adds some functionality to sense this and special-cases it when generating the statement info for capture.

…2 to deinterleave3-8. (llvm#168640)

This implements null base class initialization for non-empty bases.

Use `IfGuardEmitter` in CallingConvEmitter. Additionally refactor the code a bit to extract duplicated code to emit the CC function prototype into a helper function.

…#168783) I saw this while doing lowering, we were not properly looking into the array sections for the variable. Presumably we didn't do a good job of making sure we did this right when making this extension, and missed this spot.

…ypes (llvm#163465) This change adds resource handle type `__hlsl_resource_t` to the list of types recognized in the Clang's built-in functions prototype string. HLSL has built-in resource classes and some of them have many methods, such as [Texture2D](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-texture2d). Most of these methods will be implemented by built-in functions that will take resource handle as an argument. This change enables us to move from generic `void(...)` prototype string for these methods and explicit argument checking in `SemaHLSL.cpp` to a prototype string with explicit argument types. Argument checking in `SemaHLSL.cpp` can be reduced to handle just the rules that cannot be expressed in the prototype string (for example verifying that the offset value in `__builtin_hlsl_buffer_update_counter` is `1` or `-1`). In order to make this work, we now allow conversions from attributed resource handle type such as `__hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(float)]]` to a plain non-attributed `__hlsl_resource_t` type.

…68382) As of recent, LLVM includes the bit-size as a `DW_AT_bit_size` (and as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark `_BitInt`s as "reconstitutable" when compiling with `-gsimple-template-names`. However, before doing so we need to make sure the `DWARFTypePrinter` can reconstruct template parameter values that have `_BitInt` type. This patch adds support for printing `DW_TAG_template_value_parameter`s that have `_BitInt` type. Since `-gsimple-template-names` only omits template parameters that are `<= 64` bit wide, we don't support `_BitInt`s larger than 64 bits.

llvm#168762) I'm not aware of any way for `%run` wrapper scripts like `iossim_run.py` ([ref](https://github.com/llvm/llvm-project/blob/d2c7c6064259320def7a74e111079725958697d4/compiler-rt/test/sanitizer_common/ios_commands/iossim_run.py#L4)) to know what testcase they are currently running. This can be useful if these wrappers need to create a (potentially remote) temporary directory for each test case.

This adds the `LIT_CURRENT_TESTCASE` environment variable to both the internal shell and the external shell, containing the full name of the current test being run.
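A wrapper script could use the variable roughly like this. The sketch below is an assumption-laden illustration, not code from the patch: `per_test_tmpdir` is a hypothetical helper, the fallback name and the sanitizing rule are choices made for the example, and the exact format of the test name is not specified in the description.

```python
import os
import re
import tempfile


def per_test_tmpdir(environ=os.environ, root=None):
    """Create a temporary directory named after the currently running lit test."""
    # LIT_CURRENT_TESTCASE is the variable added by the change described above;
    # the fallback value is an assumption for runs outside of lit.
    name = environ.get("LIT_CURRENT_TESTCASE", "unknown-test")
    # Replace characters that are awkward in file names (spaces, ':', '/').
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return tempfile.mkdtemp(prefix=safe + "-", dir=root)
```

Deriving the directory name from the test name makes leftover directories easy to attribute to a specific test when debugging.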

…cialCaseList.cpp llvm#168088 (llvm#168779) This test has a long recursive call chain. The search tree can be pruned early by swapping the CC test and the recursive simplifyAssumingCCVal call.

Fixes: llvm#168088

Co-authored-by: anoopkg6 <anoopkg6@github.com>

Another miss when working through 'link', we didn't properly handle giving the whole array-section expression or array index expression, instead allowed it to only get the decl-ref-expr. This patch makes sure we don't add the wrong thing.

…simple-template-names (llvm#168383)

Depends on:
* llvm#168382

As of recent, LLVM includes the bit-size as a `DW_AT_bit_size` (and as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark `_BitInt`s as "reconstitutable" when compiling with `-gsimple-template-names`. We still only omit template parameters that are `<= 64` bit wide, so support for `_BitInt`s larger than 64 bits is not part of this patch.

… intrinsics. (llvm#168668) We can constant fold interleave of identical splat vectors to a larger splat vector.
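The fold can be illustrated with fixed-length lists standing in for vectors. This is a sketch of the idea only, not LLVM's ConstantFolding code: the function names are hypothetical, and for scalable vectors the real fold reasons about the splat value without knowing the element count, which a Python list cannot model.

```python
def interleave(vectors):
    """Reference semantics: take element i of each input in turn."""
    n = len(vectors[0])
    assert all(len(v) == n for v in vectors)
    return [v[i] for i in range(n) for v in vectors]


def splat_value(vector):
    """Return the repeated element if `vector` is a splat, else None."""
    first = vector[0]
    return first if all(x == first for x in vector) else None


def fold_interleave_of_splats(vectors):
    """If every input is a splat of the same value, return the wider splat."""
    values = [splat_value(v) for v in vectors]
    if values[0] is None or any(val != values[0] for val in values):
        return None  # not foldable by this rule
    return [values[0]] * (len(vectors) * len(vectors[0]))
```

The fold is valid because interleaving inputs that all hold the same value in every lane cannot produce anything other than that value in every output lane.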

…include` (llvm#168196) Closes [llvm#166938](llvm#166938)

Ref commit in incubator: ee17ff6

There is a minor change in the assumption for emitting a direct callee. In the incubator, `bool hasAttributeNoBuiltin = false` (`llvm-project/clang/lib/CIR/CodeGen/CIRGenExpr.cpp:1671`), while upstream it is true; therefore, the call to finite(...) is no longer converted to a builtin.

Fixes llvm#163892

z1-cciauto · 2025-11-19T23:55:37Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2886

rampitec and others added 30 commits November 19, 2025 10:24

[AMDGPU] Add baseline test to show spilling of wmma scale. NFC (llvm#…

ddbdc9a

…168163) This is to show the spilling of WMMA scale values which are limited to low 256 VGPRs. We have free registers, just RA allocates low 256 first.

[MLIR][NVVM] Doc fixes (llvm#168716)

98b1708

[gn] port 2675dcd (lldb-server PlatformOptions.inc)

1f34550

[gn] port 0ae2bcc (arm SDNodeInfo)

bc5f3d2

[gn] port e47e9f3 (nvptx SDNodeInfo)

3adcfd2

[InstSimplify] Add whitespace to struct declarations in vector-calls.…

d2c7c60

…ll. NFC This matches how IR is printed.

Minor fix of reproducer in llvm#165572 (llvm#168751)

1233c4b

[TableGen] Use size_t for SubRegIndicesSize (NFC) (llvm#168728)

fc95558

This patch changes the type of SubRegIndicesSize to size_t. The original type deduced for "auto" is a signed type, but size_t, an unsigned type, is safe here according to the usage.

[gn] port 0ae2bcc more

06f0d30

[gn build] Port 8fce476

fb8155c

[CIR] Handle default arguments in ctors (llvm#168649)

5611268

This adds the scalar expression visitor needed to handle default arguments being passed to constructors.

[LV] Simplify existing load/store sink/hoisting tests, extend coverage.

e148d2d

Clean up some of the existing predicated load/store sink/hoisting tests and add additional test coverage for more complex cases.

workflows/libclang-abi-tests: Remove use of install-ninja action (llv…

e00314b

…m#168642) This is not needed now that we are using the container to run the workflow.

workflows/hlsl-test-all: Drop use of setup-windows action (llvm#167437)

2f9f492

It doesn't look like these tests are actually run on Windows, so we don't need it.

[CIR] Upstream CIR codegen for mxcsr x86 builtins (llvm#167948)

f859427

[RISCV] Fix CFI Multiple Locations Test (llvm#168772)

2b16ae0

DAG: Use poison for some vector result widening (llvm#168290)

253ed52

jhuber6 and others added 19 commits November 19, 2025 15:56

[VPlan] Collect FMFs for in-loop reduction chain in VPlan. (NFC)

040d9c9

Replace retrieving FMFs for in-loop reduction via underlying instruction + legal by collecting the flags during reduction chain traversal in VPlan.

Fix build breakage from: llvm#167948 (llvm#168781)

3b49c92

It appears that this broke the build by not using the 'correct' name for the expression. This is probably something that crossed in review.

[ConstantFolding] Generalize constant folding for vector_deinterleave…

90ea49a

…2 to deinterleave3-8. (llvm#168640)

[CIR] Handle non-empty null base class initialization (llvm#168646)

3f55f8b

This implements null base class initialization for non-empty bases.

[NFC][TableGen] Use IfGuardEmitter in CallingConvEmitter (llvm#168763)

308185e

Use `IfGuardEmitter` in CallingConvEmitter. Additionally refactor the code a bit to extract duplicated code to emit the CC function prototype into a helper function.

[ConstantFolding] Add constant folding for scalable vector interleave…

8830525

… intrinsics. (llvm#168668) We can constant fold interleave of identical splat vectors to a larger splat vector.

[clang-tidy] Add IgnoredFilesList option to `readability-duplicate-…

2aa2290

…include` (llvm#168196) Closes [llvm#166938](llvm#166938)

merge main into amd-staging

6a3d6db

z1-cciauto requested a review from a team November 19, 2025 23:54

ronlieb approved these changes Nov 20, 2025

z1-cciauto merged commit 0d7728c into amd-staging Nov 20, 2025
10 checks passed

z1-cciauto deleted the upstream_merge_202511191854 branch November 20, 2025 02:42

36 participants