forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 77
merge main into amd-staging #629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…168163) This is to show the spilling of WMMA scale values which are limited to low 256 VGPRs. We have free registers, just RA allocates low 256 first.
…ll. NFC This matches how IR is printed.
Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.
…e time. (llvm#163468) Document a define to allow library developers to support disabling AddressSanitizer's container overflow detection in template code at compile time. The primary motivation is to reduce false positives in environments where libraries and frameworks that cannot be recompiled with sanitizers enabled are called from application code. This supports disabling checks when the runtime environment cannot be reliably controlled to use ASAN_OPTIONS. Key changes: - Use the define `__SANITIZER_DISABLE_CONTAINER_OVERFLOW__` to disable instrumentation at compile time - Implemented redefining the container overflow APIs in common_interface_defs.h to use define to provide null implementation when define is present - Update documentation in AddressSanitizer.rst to suggest and illustrate use of the define - Add details of the define in PrintContainerOverflowHint() - Add test disable_container_overflow_checks to verify new hints on the error and fill the testing gap that ASAN_OPTIONS=detect_container_overflow=0 works - Add tests demonstrating the issue around closed source libraries and instrumented apps that both modify containers This requires no compiler changes and should be supportable cross compiler toolchains. An RFC has been opened to discuss: https://discourse.llvm.org/t/rfc-add-fsanitize-address-disable-container-overflow-flag-to-addresssanitizer/88349
This patch changes the type of SubRegIndicesSize to size_t. The original type deduced for "auto" is a signed type, but size_t, an unsigned type, is safe here according to the usage.
…get LHS for complex compound assignment (llvm#166798) - Fixes llvm#166512 - `ComplexExprEmitter::EmitCompoundAssignLValue` is calling `EmitLoadOfScalar(LValue, SourceLocation)` to load the LHS value in the case that it's non-complex, however this function requires that the value is a simple LValue - issue occurred because the LValue in question was a bitfield LValue. I changed it to use this function which seems to handle all of the different cases (deferring to the original `EmitLoadOfScalar` if it's a simple LValue)
R_AARCH64_FUNCINIT64 is a dynamic relocation type for relocating word-sized data in the output file using the return value of a function. An R_AARCH64_FUNCINIT64 shall be relocated as an R_AARCH64_IRELATIVE with the target symbol address if the target symbol is non-preemptible, and it shall be a usage error to relocate an R_AARCH64_FUNCINIT64 with a preemptible or STT_GNU_IFUNC target symbol. The initial use case for this relocation type shall be for emitting global variable field initializers for structure protection. With structure protection, the relocation value computation is tied to the compiler implementation in such a way that it would not be reasonable to define a relocation type for it (for example, it may involve computing a hash using a compiler-determined algorithm), hence the need for the computation to be implemented as code in the binary. Part of the AArch64 psABI extension: ARM-software/abi-aa#340 Reviewers: smithp35, fmayer, MaskRay Reviewed By: fmayer Pull Request: llvm#156564
This adds the scalar expression visitor needed to handle default arguments being passed to constructors.
vulkan_sdk_setup is the name of the method that configures it, but the repo itself has the name vulkan_sdk This was caught by enabling the bzlmod flag for CI. The GH action runs `blaze test @llvm-project/...` but the target is tagged manual, so it's excluded. The buildkite CI runs `bazel query | xargs bazel test` which will include manual targets.
Clean up some of the existing predicated load/store sink/hosting tests and add additional test coverage for more complex cases.
Given a set of pointers, check if they can be rearranged as follows (%s is a constant): %b + 0 * %s + 0 %b + 0 * %s + 1 %b + 0 * %s + 2 ... %b + 0 * %s + w %b + 1 * %s + 0 %b + 1 * %s + 1 %b + 1 * %s + 2 ... %b + 1 * %s + w ... If the pointers can be rearanged in the above pattern, it means that the memory can be accessed with a strided loads of width `w` and stride `%s`.
…llvm#168026) The previous implementation of `combineOp_VLToVWOp_VL` manually replaced old nodes with newly created widened nodes, but only added the new node itself to the `DAGCombiner` worklist. Since the users of the new node were not added, some combine opportunities could be missed when external `DAGCombiner` passes expected those users to be reconsidered. This patch replaces the custom replacement logic with a call to `DCI.CombineTo()`, which performs node replacement in a way consistent with `DAGCombiner::Run`: - Replace all uses of the old node. - Add the new node and its users to the worklist. - Clean up unused nodes when appropriate. Using `CombineTo` ensures that `combineOp_VLToVWOp_VL` behaves consistently with the standard `DAGCombiner` update model, avoiding discrepancies between the private worklist inside this routine and the global worklist managed by the combiner. This resolves missed combine cases involving VL -> VW operator widening. --------- Co-authored-by: Kai Lin <omg_link@qq.com>
…m#168642) This is not needed now that we are using the container to run the workflow.
It doesn't look like these tests are actually run on Windows, so we don't need it.
We don't need this anymore since all new contributors in the last year have applied for commit access using GitHub issues. There is already code in the script that removes anyone who submitted a request, so we don't need the old code any more (which was way too conservitave and very slow).
…7815) To prepare for more backends to use Mustache templates, this patch lifts the Mustache utilities from `HTMLMustacheGenerator.cpp` to `Generators.h`. A MustacheGenerator interface is created to share code for template creation.
The collection of library function names in TargetLibraryInfo faces similar challenges as RuntimeLibCalls in the IR component. The number of function names is large, there are numerous customizations based on the triple (including alternate names), and there is a lot of replicated data in the signature table. The ultimate goal would be to capture all lbrary function related information in a .td file. This PR brings the current .def file to TableGen, almost as a 1:1 replacement. However, there are some improvements which are not possible in the current implementation: - the function names are now stored as a long string together with an offset table. - the table of signatures is now deduplicated, using an offset table for access. The size of the object file decreases about 34kB with these changes. The hash table of all function names is still constructed dynamically. A static table like for RuntimeLibCalls is the next logical step. The main motivation for this change is that I have to add a large number of custom names for z/OS (like in RuntimeLibCalls.td), and the current infrastructur does not support this very well.
I found that the issues we've been seeing in the HTML whitespace/alignment are due to partials inserting their own whitespace and calling partials on indented lines or lines containing text already. This patch gets rid of unnecessary whitespace in the comment and function partials so that they are properly indented when inserted.
The test was added in llvm#147252. On a 32-bit target, it fails with error: ``` File "...\TestDataFormatterLibcxxInvalidString.py", line 23, in test self.skip() ^^^^^^^^^ AttributeError: 'LibcxxInvalidStringDataFormatterTestCase' object has no attribute 'skip' ```
Summary: We start this thread if the RPC client symbol is detected in the loaded binary. We should make this sleep if there's no work to avoid the thread running at high priority when the (scarecely used) RPC call is actually required. So, right now after 25 microseconds we will assume the server is inactive and begin sleeping. This resets once we do find work. AMD supports a more intelligent way to do this. HSA signals can wake a sleeping thread from the kernel, and signals can be sent from the GPU side. This would be nice to have and I'm planning on working with it in the future to make this infrastructure more usable with existing AMD workloads.
This PR upstreams `cir.await` and adds initial codegen for emitting a skeleton of the ready, suspend, and resume branches. Codegen for these branches is left for a future PR. It also adds a test for the invalid case where a `cir.func` is marked as a coroutine but does not contain a `cir.await` op in its body.
Replace retrieving FMFs for in-loop reduction via underlying instruction + legal by collecting the flags during reduction chain traversal in VPlan.
It appears that this broke the build by not using the 'correct' name for the expression. This is probably something that crossed in review.
In my last patch, it became clear during code review that the postfix operation was actually a read THEN update, not update/read like other single line versions. It wasn't clear at the time how much additional work this would be to make postfix work correctly (and they are a bit of a 'special' thing in codegen anyway), so this patch adds some functionality to sense this and special-cases it when generating the statement info for capture.
…2 to deinterleave3-8. (llvm#168640)
This implements null base class initialization for non-empty bases.
Use `IfGuardEmitter` in CallingConvEmitter. Additionally refactor the code a bit to extract duplicated code to emit the CC function prototype into a helper function.
…#168783) I saw this while doing lowering, we were not properly looking into the array sections for the variable. Presumably we didn't do a good job of making sure we did this right when making this extension, and missed this spot.
…ypes (llvm#163465) This change adds resource handle type `__hlsl_resource_t` to the list of types recognized in the Clang's built-in functions prototype string. HLSL has built-in resource classes and some of them have many methods, such as [Texture2D](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-texture2d). Most of these methods will be implemented by built-in functions that will take resource handle as an argument. This change enables us to move from generic `void(...)` prototype string for these methods and explicit argument checking in `SemaHLSL.cpp` to a prototype string with explicit argument types. Argument checking in `SemaHLSL.cpp` can be reduced to handle just the rules that cannot be expressed in the prototype string (for example verifying that the offset value in `__builtin_hlsl_buffer_update_counter` is `1` or `-1`). In order to make this work, we now allow conversions from attributed resource handle type such as `__hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(float)]]` to a plain non-attributed `__hlsl_resource_t` type.
…68382) As of recent, LLVM includes the bit-size as a `DW_AT_bit_size` (and as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark `_BitInt`s as "reconstitutable" when compiling with `-gsimple-template-names`. However, before doing so we need to make sure the `DWARFTypePrinter` can reconstruct template parameter values that have `_BitInt` type. This patch adds support for printing `DW_TAG_template_value_parameter`s that have `_BitInt` type. Since `-gsimple-template-names` only omits template parameters that are `<= 64` bit wide, we don't support `_BitInt`s larger than 64 bits.
llvm#168762) I'm not aware of any way for `%run` wrapper scripts like `iosssim_run.py` ([ref](https://github.com/llvm/llvm-project/blob/d2c7c6064259320def7a74e111079725958697d4/compiler-rt/test/sanitizer_common/ios_commands/iossim_run.py#L4)) to know what testcase they are currently running. This can be useful if these wrappers need to create a (potentially remote) temporary directory for each test case. This adds the `LIT_CURRENT_TESTCASE` environment variable to both the internal shell and the external shell, containing the full name of the current test being run.
…cialCaseList.cpp llvm#168088 (llvm#168779) This test has long call chain in recursion. Search tree can be pruned early by swapping CC test and recursive simplifyAssumingCCVal. Fixes: llvm#168088 Co-authored-by: anoopkg6 <anoopkg6@github.com>
Another miss when working through 'link', we didn't properly handle giving the whole array-section expression or array index expression, instead allowed it to only get the decl-ref-expr. This patch makes sure we don't add the wrong thing.
…simple-template-names (llvm#168383) Depends on: * llvm#168382 As of recent, LLVM includes the bit-size as a `DW_AT_bit_size` (and as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark `_BitInt`s as "reconstitutable" when compiling with `-gsimple-template-names`. We still only omit template parameters that are `<= 64` bit wide. So support `_BitInt`s larger than 64 bits is not part of this patch.
… intrinsics. (llvm#168668) We can constant fold interleave of identical splat vectors to a larger splat vector.
Ref commit in incubator: ee17ff6 There is a minor change in the assumption for emitting a direct callee. In incubator, `bool hasAttributeNoBuiltin = false` (`llvm-project/clang/lib/CIR/CodeGen/CIRGenExpr.cpp:1671`), while in upstream, it's true, therefore, the call to finite(...) is not converted to a builtin anymore. Fixes llvm#163892
Collaborator
Author
ronlieb
approved these changes
Nov 20, 2025
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.