forked from llvm/llvm-project
merge main into amd-staging #562
Merged
z1-cciauto
merged 100 commits into
amd-staging
from
amd/merge/upstream_merge_20251111061038
Nov 11, 2025
Conversation
…w clangOptions library" (llvm#167374) This relands llvm#167348. The original PR was reverted due to a reported build failure, which was later diagnosed as a local issue in the developer’s checkout or build state. See discussion here: llvm#163659 (comment) No additional changes have been made in this reland.
This isn't reachable today but will come into play once we reorder passes for llvm#147352 and llvm#147351. Note that the `CBufferRowIntrin` helper struct is copied from the `DXILCBufferAccess` pass, but it will be removed from there when we simplify that pass in llvm#147351
trans bf16 insts do not support clamp and omod
…m#167276) Preparing to move most of the implementation out of the header file. * llvm#167280 --------- Co-authored-by: Naveen Seth Hanig <naveen.hanig@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…lvm#167383) Re-land: Add test for Complex imag literal GNU extension after updating the name
…#152830) This adds the codegen support for the dyn_groupprivate clause.
It is undefined behavior to call a function with a mismatched calling convention. Rather than crash on this behavior, it should compile. This LLVM defect was identified via the AMD Fuzzing project.
…166658) (llvm#166846) The previous implementation in llvm#166252 (rolled back in llvm#166658) caused buildbot failures due to a bug in an added test. The modified `UseAfterClose` did not pass a `struct flock` value to `fcntl`: https://github.com/llvm/llvm-project/blob/2d5170594147b42a37698760d6e0194eec4f1390/libc/test/src/fcntl/fcntl_test.cpp#L175 Which ASAN caught and errored in the `fcntl` implementation when the unspecified argument was accessed: https://github.com/llvm/llvm-project/blob/c12cb2892c808af459eaa270b8738a2b375ecc9b/libc/src/__support/OSUtil/linux/fcntl.cpp#L59
Fixes regressions with llvm#159493 after 476a6ea
Use getDefiningRecipe to future-proof the code. Split off from llvm#156262 as suggested.
This commit introduces `SpecialCaseList::Match`, a small struct to hold the matched rule and its line number. This simplifies the `match` methods by allowing them to return a single value instead of using a callback. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
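The "return a single value instead of using a callback" idea can be sketched roughly as below. This is a minimal illustration, not the LLVM code: `Rule`, `Match`, its fields, and `findMatch` are hypothetical names, and exact string comparison stands in for whatever pattern matching the real class performs.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical rule record: a pattern plus the line it was declared on.
struct Rule {
  std::string Pattern;
  unsigned LineNo;
};

// Hypothetical result type in the spirit of the commit message:
// the matched rule text together with its line number.
struct Match {
  std::string_view RuleText;
  unsigned LineNo;
};

// Returns the first matching rule as a value, so callers need no callback.
// Assumption: exact comparison replaces real glob/regex matching here.
std::optional<Match> findMatch(const std::vector<Rule> &Rules,
                               std::string_view Query) {
  for (const Rule &R : Rules)
    if (R.Pattern == Query)
      return Match{R.Pattern, R.LineNo};
  return std::nullopt;
}
```

A caller can then write `if (auto M = findMatch(Rules, Query)) use(M->LineNo);` instead of passing a lambda that captures the line number. The returned `RuleText` views storage owned by `Rules`, so it must not outlive that vector.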
Make the systemSupportsMemoryTagging() function return even on systems that don't support memory tagging. This avoids the need to always check whether memory tagging is supported before calling the function. Make systemSupportsMemoryTagging() cache the getauxval return value instead of calling the function every time. Update the code that calls systemSupportsMemoryTagging().
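The caching described here can be sketched with a function-local static. This is an assumption about the general shape, not the patch itself: `probeHwcap`, `ProbeCalls`, `kHypotheticalTagBit`, and `supportsTaggingCached` are made-up names, and the probe returns a fixed value so the example runs on any platform instead of calling `getauxval`.

```cpp
// Hypothetical stand-in for the getauxval() query; counts how often it runs.
int ProbeCalls = 0;
unsigned long probeHwcap() {
  ++ProbeCalls;
  return 0; // pretend the capability word has no tagging bit set
}

// Hypothetical capability bit, used only for this illustration.
constexpr unsigned long kHypotheticalTagBit = 1UL << 18;

// The probe runs once, on the first call; later calls reuse the cached word.
// On a system without the feature this simply returns false.
bool supportsTaggingCached() {
  static const unsigned long Cached = probeHwcap();
  return (Cached & kHypotheticalTagBit) != 0;
}
```

Since C++11, initialization of a function-local static is thread-safe, so concurrent first calls still run the probe only once.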
This hardens the code to check based on WideMember0's operands. This ensures each call will go through the same check. Should be NFC currently but needed when generalizing in follow-up patches.
…m#153608) This should be the peephole's job. ANDS sets the V flag to 0, which is why signed comparisons with 0 can be replaced with TST. Note that this applies only to AArch64, because ANDS on ARM leaves the V flag unchanged. Fixes: llvm#154387
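The flag argument can be checked with a small model. This is an illustrative sketch, not code from the patch: `NZCV` and the helper names are hypothetical, and the model assumes AArch64 semantics where ANDS/TST derive N and Z from the result and clear C and V, while a compare against zero sets C and clears V.

```cpp
#include <cstdint>

// Hypothetical model of the AArch64 condition flags.
struct NZCV {
  bool N, Z, C, V;
};

// Flags after `ands`/`tst`: N and Z come from the result, C and V are cleared.
NZCV flagsAfterAnds(int64_t A, int64_t B) {
  int64_t R = A & B;
  return {R < 0, R == 0, false, false};
}

// Flags after `cmp x, #0`: x - 0 never overflows (V = 0) or borrows (C = 1).
NZCV flagsAfterCmpZero(int64_t X) { return {X < 0, X == 0, true, false}; }

// Signed condition codes read only N, Z and V.
bool condGE(NZCV F) { return F.N == F.V; }
bool condLT(NZCV F) { return F.N != F.V; }
bool condGT(NZCV F) { return !F.Z && F.N == F.V; }
bool condLE(NZCV F) { return F.Z || F.N != F.V; }
```

In this model the two instruction forms differ only in C, which the signed conditions never read, so `cmp (a & b), #0` followed by a signed condition behaves like `tst a, b`. If V were left unchanged instead, as the commit message says happens on ARM, the equivalence would depend on stale state.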
… of pattern matching (llvm#166806) Flang alias analysis used to find allocation sites by pattern matching allocation ops, mainly in the FIR dialect. This MR extends the characterization so that it is instead based on whether the result of an op has the MemAlloc effect.
After llvm#164165, we emit warnings from non-system headers by default. This change only preserves functionality of `clang-tidy` as it was before the change.
…m#158224) Both conceptually belong to the same subtarget, so it should not be necessary to pass in the context TargetRegisterInfo to any TargetInstrInfo member. Add this reference so those superfluous arguments can be removed. Most targets placed their TargetRegisterInfo as a member in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo, so unify all targets to look the same.
Will help in future patches.
The save stats test would fail downstream due to the tests being run from a read only directory. The cwd flag would attempt to place the statistics in that read only directory, causing an error. This patch ensures that the tests are always run from a temp directory.
llvm#167350) fwrite and friends don't modify errno if no error occurred. Therefore fwrite_unlocked's return value shouldn't be constructed from errno without checking if an error actually occurred. This fixes an error introduced by llvm@9e2f73f
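The pitfall can be illustrated from the caller's side with the standard `fwrite`. This is a sketch under stated assumptions, not the libc-internal fix: `WriteResult` and `checkedWrite` are hypothetical names, and the point is only that errno is consulted after a short write, never unconditionally.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>

// Hypothetical result type: bytes written plus an error code (0 = no error).
struct WriteResult {
  std::size_t Written;
  int Error;
};

// Reads errno only when fwrite reports a short write. A stale errno left
// over from an earlier, unrelated call is therefore never reported as a
// failure of this write.
WriteResult checkedWrite(const void *Data, std::size_t Size, std::FILE *F) {
  assert(F != nullptr && "caller must pass an open stream");
  std::size_t N = std::fwrite(Data, 1, Size, F);
  if (N < Size)
    return {N, errno};
  return {N, 0};
}
```

Building the result from errno without the `N < Size` check would turn any leftover errno value into a spurious error, which is the kind of bug the commit message describes.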
This is merging loads and stores so use the combined DebugLoc. Not sure if computeBase should be using the merged location from all the involved instructions. I'm also not sure how to test this sort of thing.
… names (llvm#167293) We shouldn't be calling `getUnversionedName` unconditionally because DWARFv6-enabled Clang will emit versioned language names (i.e., `sourceLanguageName`...see new test case). We need to check `hasVersionedName` and pick the appropriate API based on that. Otherwise we assert in `getUnversionedName`.
When transforming floating-point induction variables into integer ones, make sure we stay within the bounds of fp values that can be represented as integers without gaps, i.e., 2^24 and 2^53 for IEEE-754 single and double precision respectively (both on negative and positive side). Fixes: llvm#166496.
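The 2^24 and 2^53 limits follow from the significand widths of the two formats (24 and 53 bits). A small sketch of such a bounds check, assuming a hypothetical helper name `fitsWithoutGaps` that does not come from the patch:

```cpp
#include <limits>

// True if every integer between 0 and V (inclusive) is exactly representable
// in FP, i.e. |V| <= 2^digits, where digits is 24 for float and 53 for double.
template <typename FP>
bool fitsWithoutGaps(long long V) {
  static_assert(std::numeric_limits<FP>::digits < 63,
                "limit must fit in long long");
  const long long Limit = 1LL << std::numeric_limits<FP>::digits;
  return V >= -Limit && V <= Limit;
}
```

For example, `fitsWithoutGaps<float>(16777216)` holds, but 16777217 is already outside the range: converting it to `float` rounds back to 16777216, so a float induction variable stepping by 1 would stall at that value while an integer one would keep counting.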
…6395) So we know before _what_ entry in the chain we need to look for the InitList. Fixes llvm#166171
This means that VPExpressions will now be constructed for VPPartialReductionRecipe's when the loop has tail-folding predication. Note that control-flow (if/else) predication is not yet handled for partial reductions, because of the way partial reductions are recognised and built up.
…nux) (llvm#165451) This patch reimplements the SME ABI `__arm_za_disable` routine within libunwind. This routine must be called before resuming from unwinding on AArch64 platforms with SME support. Before calling the routine, we need to check that SME is available. In this patch, this is implemented for Linux-based platforms by checking HWCAP2. It should be possible to implement this check for other platforms as required. This patch includes a test for this functionality. This test requires SME, so on platforms without it, it will simply pass.
llvm#161635) `aligned_storage` has been deprecated and will most likely be removed in a future version of C++. This patch removes some of its uses to avoid having to work around its removal in the future.
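One common way to drop `std::aligned_storage` is an `alignas`-qualified byte array. The sketch below shows that general technique only; whether the patch uses this exact form is not stated in the commit message, and `ManualSlot` is a hypothetical name.

```cpp
#include <new>
#include <utility>

// Raw, suitably aligned storage for one T, without std::aligned_storage.
template <typename T>
class ManualSlot {
  // Replaces std::aligned_storage_t<sizeof(T), alignof(T)>.
  alignas(T) unsigned char Buf[sizeof(T)];

public:
  // Constructs a T in place; the caller is responsible for calling destroy().
  template <typename... Args>
  T *emplace(Args &&...A) {
    return ::new (static_cast<void *>(Buf)) T(std::forward<Args>(A)...);
  }

  // Accesses the object previously created by emplace().
  T *get() { return std::launder(reinterpret_cast<T *>(Buf)); }

  // Ends the lifetime of the stored object.
  void destroy() { get()->~T(); }
};
```

The byte array states size and alignment directly, so nothing here depends on the deprecated trait. `std::launder` requires C++17.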
llvm#167255) Patch updates transform.memref.erase_dead_alloc_and_stores to not delete escaped allocations.
The original problem does not reproduce anymore, but add the test case. Fixes llvm#163563
This allows us to strip a special case in VPWidenGEP::execute.
This patch adds cost model tests for `bfloat` operations with `+sve-b16b16`. Currently, some of these costs are higher than they should be as the cost model is assuming `bfloat`s need promotion, but some of these operations are natively supported with `+sve-b16b16`.
After llvm#167359 / 95db31e. A fix was attempted in llvm#167423 but was not quite enough. From what I could understand, in v1 format you have to specify all the basic blocks. Where before !call_me implied they were all cold (I think, very shaky understanding here). For this test we want to see blocks like call_me/foo/call_me. So adding a line for block 1 fixes the tests. It could produce more blocks at some point but I think as long as foo is within two of them, it'll be fine.
… into new clangOptions library" (llvm#167374)" This reverts commit f63d33d.
) Currently the only way to enable the use of wide active lane masks is to pass -enable-wide-lane-mask and force both interleaving & tail-folding with additional flags. This patch changes selectInterleaveCount to consider interleaving if wide lane masks were requested, although the feature remains off by default.
…se the load is hidden behind a bitcast (llvm#167491)
Adds `transform.xegpu.set_gpu_launch_threads` that overrides `gpu.launch` operation threads.
Collaborator
dpalermo
approved these changes
Nov 11, 2025
No description provided.