forked from llvm/llvm-project
[AutoBump] Merge with fixes of 72086490 (Dec 04) (20) [Only tested MLIR] #490
Merged
This PR adds all the missing semantics for the Linear clause based on the OpenMP 5.2 restrictions. The restriction details are mentioned below.

OpenMP 5.2:

5.4.6 linear Clause restrictions
- A linear-modifier may be specified as ref or uval only on a declare simd directive.
- If linear-modifier is not ref, all list items must be of type integer.
- If linear-modifier is ref or uval, all list items must be dummy arguments without the VALUE attribute.
- List items must not be Cray pointers or variables that have the POINTER attribute. Cray pointer support has been deprecated.
- If linear-modifier is ref, list items must be polymorphic variables, assumed-shape arrays, or variables with the ALLOCATABLE attribute.
- A common block name must not appear in a linear clause.
- The list-item cannot appear more than once.

4.4.4 ordered Clause restriction
- If n is explicitly specified, a linear clause must not be specified on the same directive.

5.11 aligned Clause restriction
- Each list item must have C_PTR or Cray pointer type or have the POINTER or ALLOCATABLE attribute. Cray pointer support has been deprecated.
…m#119004) fd3907c introduced a check for system offload-tblgen executable when doing a standalone build. This check is bogus, since offload-tblgen is built as part of offload and not some other preinstalled component. The path is also overwritten below, so the check only causes tests to be disabled unnecessarily.
This reverts commit b5bd192. It broke multiple LLVM bots, including clang-x64-windows-msvc.
…d regs (llvm#115756) This fixes https://discourse.llvm.org/t/fixed-register-being-spill-and-restored-in-clang/83058. We need to do it in `MachineRegisterInfo::getCalleeSavedRegs` instead of `RISCVRegisterInfo::getCalleeSavedRegs` since the MF argument of `TargetRegisterInfo::getCalleeSavedRegs` is `const`, so we can't call `MF->getRegInfo().disableCalleeSavedRegister` there. To put it in `MachineRegisterInfo::getCalleeSavedRegs`, we move `isRegisterReservedByUser` into `TargetSubtargetInfo`.
Upcoming changes will improve codegen in these cases per the included TODOs.
This allows shared libraries instrumented with RTSan to be initialized. This approach directly mirrors the approach in TSan, ASan, and many of the other sanitizers.
…lvm#112219) This adds support for these instructions and also tests getOperandInfo for these instructions as well. I think the VL on the using add instruction can be optimized further, once we add support for optimizing non-vlmax.
This patch adds a helper function to replace an idiom like:

    CallStackId CSId = hashCallStack(CallStack);
    MemProfData.CallStacks.try_emplace(CSId, CallStack);
    // Do something with CSId.
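The idiom above could be folded into a helper along the following lines. This is only a minimal sketch under stated assumptions, not the patch's code: the type aliases, the `CallStackTable` struct, the `addCallStack` name, and the placeholder hash are all hypothetical stand-ins for the real MemProf definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Simplified stand-ins for illustration only; the real MemProf types differ.
using FrameId = std::uint64_t;
using CallStackId = std::uint64_t;
using CallStack = std::vector<FrameId>;

// Placeholder hash (FNV-style mixing over whole frame ids). The real
// hashCallStack is not reproduced here.
inline CallStackId hashCallStack(const CallStack &CS) {
  CallStackId H = 0xcbf29ce484222325ULL;
  for (FrameId F : CS) {
    H ^= F;
    H *= 0x100000001b3ULL;
  }
  return H;
}

struct CallStackTable {
  std::unordered_map<CallStackId, CallStack> CallStacks;

  // Hypothetical helper: hash the call stack, insert it if it is not
  // already present, and hand the id back to the caller.
  CallStackId addCallStack(const CallStack &CS) {
    CallStackId CSId = hashCallStack(CS);
    CallStacks.try_emplace(CSId, CS);
    return CSId;
  }
};
```

With a helper like this, a call site shrinks to a single line such as `CallStackId CSId = Table.addCallStack(CS);`, and adding the same call stack twice leaves only one entry in the map.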
…edByteAddressBuffer definitions to HLSLExternalSemaSource llvm#113477 (llvm#116699) This is the first one in a series of PRs adding the requirements for llvm#58654. This PR adds `ByteAddressBuffer`, `RWByteAddressBuffer` and `RasterizerOrderedByteAddressBuffer` definitions as well as their handle lowering to `dx.RawBuffer`. Closes llvm#58654 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>
…#118999) We were still using the old `defined(_LIBCPP_HAS_THREAD_API_PTHREAD)` check, which is always true.
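One way a `defined(...)` check can end up always true is when the macro is always defined, with its value (0 or 1) carrying the actual setting. The sketch below illustrates that situation with a hypothetical macro `MYLIB_HAS_THREAD_API_PTHREAD`; it is an assumption about the general pattern, not a copy of libc++'s configuration.

```cpp
#include <cassert>

// Hypothetical config macro that is always defined, to either 0 or 1.
#define MYLIB_HAS_THREAD_API_PTHREAD 0

// Presence-based check: true whenever the macro is defined, even when it is 0.
#if defined(MYLIB_HAS_THREAD_API_PTHREAD)
constexpr bool PresenceCheck = true;
#else
constexpr bool PresenceCheck = false;
#endif

// Value-based check: follows the macro's value.
#if MYLIB_HAS_THREAD_API_PTHREAD
constexpr bool ValueCheck = true;
#else
constexpr bool ValueCheck = false;
#endif

static_assert(PresenceCheck, "defined() ignores the macro's value");
static_assert(!ValueCheck, "the value-based check sees the 0");
```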
…r outlining kernels. (llvm#118861) Adding optional attributes so we can specify the kernel function names and the kernel module names generated.
The compiler crashes with an ICE when it tries to create a `memset` with scalable size.
…llvm#117915) This patch refactors the tests around aligned allocation and sized deallocation to avoid relying on passing the -fsized-deallocation or -faligned-allocation flags by default. Since both of these features are enabled by default in >= C++14 mode, it now makes sense to make that assumption in the test suite. A notable exception is MinGW and some older compilers, where sized deallocation is still not enabled by default. We treat that as a "bug" in the test suite and we work around it by explicitly adding -fsized-deallocation, but only under those configurations.
…#119018) Summary: Adds support for scoped fences now that the NVPTX backend doesn't break on them.
Reland llvm#116109. Fixes issue where operands were flipped.

Per the PTX spec, a mov instruction packs the first operand as low, and the second operand as high:

> ```
> // pack two 16-bit elements into .b32
> d = a.x | (a.y << 16)
> ```

On the other hand, cvt.rn.f16x2.f32 instructions take the high operand first, then the low operand:

> For .f16x2 and .bf16x2 instruction type, two inputs a and b of .f32 type are converted into .f16 or .bf16 type and the converted values are packed in the destination register d, such that the value converted from input a is stored in the upper half of d and the value converted from input b is stored in the lower half of d
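The two operand orders can be modeled on plain integers. The sketch below is illustrative only: `packMovB32` and `packCvtOrder` are hypothetical names, and the float-to-half conversion itself is not modeled, so both functions take 16-bit patterns that are assumed to be already converted.

```cpp
#include <cassert>
#include <cstdint>

// mov-style packing, following the quoted PTX snippet: the first operand
// lands in the low half, the second in the high half.
constexpr std::uint32_t packMovB32(std::uint16_t X, std::uint16_t Y) {
  return static_cast<std::uint32_t>(X) | (static_cast<std::uint32_t>(Y) << 16);
}

// cvt.rn.f16x2.f32-style ordering: the first input lands in the upper half,
// the second in the lower half. Only the packing order is modeled here.
constexpr std::uint32_t packCvtOrder(std::uint16_t A, std::uint16_t B) {
  return (static_cast<std::uint32_t>(A) << 16) | static_cast<std::uint32_t>(B);
}

// The same two halves need swapped operands to produce the same register.
static_assert(packMovB32(0x1111, 0x2222) == packCvtOrder(0x2222, 0x1111),
              "operand order is reversed between the two forms");
```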
This splits off changes for more complex CFGs in VPlan from both
llvm#114292
llvm#112138
This simplifies their respective diffs.
…buildvector fully matched If a perfect diamond match was detected for the postponed buildvectors and the vector for the previous node comes after the current node, the vector register needs to be moved before the current insertion point to prevent a compiler crash. Fixes llvm#119002
…m#101259) The handling of libc++ and other runtime libraries in the baremetal driver is different from other targets for no particular reason. This change removes the custom logic in the baremetal driver and replaces it with the generic logic to improve consistency and reduce maintenance overhead while also handling additional flags the current logic doesn't.
The Swift plugin would find this useful.
…m#116642) Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.
This patch fixes: llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:953:30: error: unused variable 'TRI' [-Werror,-Wunused-variable]
Module split assumes that a kernel function must have external linkage; however, that isn't the case. For example, a static kernel function will have weak_odr linkage.
Change-Id: I1e5dee0de1fd866b365f4090a574e1b2961f8dca
…lvm#117419) Fixes llvm#76426, llvm#109778 (for AArch64) The previous patch for this issue, llvm#94271, generated an error message if a register and a global variable did not have the same size. This patch checks if the register is reserved.
…s used (llvm#117419)" This reverts commit 8fc6fca.
…lldb/examples (llvm#113398) This PR adds a proof-of-concept for a bytecode designed to ship and run LLDB data formatters. More motivation and context can be found in the `formatter-bytecode.md` file and on discourse. https://discourse.llvm.org/t/a-bytecode-for-lldb-data-formatters/82696
We'd like to build runtimes using FatLTO (see https://llvm.org/docs/FatLTO.html for details). This gives us more control over how libc++ can be consumed by users of our toolchain, like the Fuchsia SDK.
This patch uses namespace scopes to remove memprof:: in MemProfUseTest.cpp as in MemProfTest.cpp. While I am at it, this patch removes a stale comment about IndexedAllocationInfo::CallStack, which has been removed.
…118948) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
When VF has a fixed width and equals the number of iterations, and we are not tail folding by masking, comparison instruction and induction operation will be DCEed later. Ignoring the costs of these instructions improves the cost model.
This patch fixes: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2699:49: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
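A common way to avoid this diagnostic before C++20 is to copy the structured bindings into ordinary local variables and capture those instead. The sketch below shows that workaround with a hypothetical function `sumPair`; whether the patch used this approach or another one is not stated here.

```cpp
#include <cassert>
#include <utility>

// Hypothetical example function, not code from LoopVectorize.cpp.
inline int sumPair(const std::pair<int, int> &P) {
  auto [First, Second] = P;
  // Capturing First or Second in the lambda below (explicitly or through a
  // capture default) is what triggers -Wc++20-extensions before C++20.
  // Copying them into ordinary locals sidesteps the problem.
  int A = First;
  int B = Second;
  auto Sum = [A, B] { return A + B; };
  return Sum();
}
```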
After 28bba0d, LLVM requires a minimum of MSVC 16.8, so update our flag to follow suit.
…m#118814) Introduces a `member` property to `SBValue`. This property provides pythonic access to a value's members, by name. The expression `value.member["name"]` will be an alternate form of writing `value.GetChildMemberWithName("name")`.
The a0c4f85 change replaced the local int_to_b36_char function returning `char` with uses of the __support function of the same name that returns `int`. The uses of the old local function lacked the casts that all other uses of the shared function of the same name had. Add them.
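The kind of cast being added can be sketched as follows. The function body here is a hypothetical stand-in that only borrows the name `int_to_b36_char` from the message; it assumes an ASCII-compatible character set and an input in the range 0 to 35, and `writeDigit` is an invented caller.

```cpp
#include <cassert>

// Stand-in for a shared helper that returns int rather than char.
// Assumes ASCII and 0 <= Num < 36.
constexpr int int_to_b36_char(int Num) {
  return Num < 10 ? '0' + Num : 'a' + (Num - 10);
}

// Hypothetical caller: storing the result into a char needs an explicit
// cast to avoid an implicit narrowing conversion from int.
inline void writeDigit(char *Out, int Num) {
  *Out = static_cast<char>(int_to_b36_char(Num));
}
```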
) For whatever reason, each ctype test contains its own copy of some identical helper source code. These local helpers were defined with external linkage for no apparent reason. This leads to multiple definition errors when linking these tests together. This change moves each file's local helper code into an anonymous namespace so it has internal linkage. It's notable that the libc test code does not follow the most common norm of gtest-style code where all the `TEST(...)` cases themselves are defined inside an anonymous namespace (along with whatever other local helpers they use); whether libc's tests should follow that usual convention can be addressed holistically in future discussion. The replacement of numerous cut&paste'd copies of identical helper code with sharing the source code in some usual fashion is also left for later cleanup. This change only makes the test code not straightforwardly have multiple definition errors that prevent linking a test executable at all.
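The linkage fix described above follows a standard C++ pattern, sketched here with a hypothetical helper `in_span` (the real test helpers are not reproduced). Names declared inside an anonymous namespace have internal linkage, so each test file can keep its own copy without clashing at link time.

```cpp
#include <cassert>

namespace {

// Internal linkage: another translation unit may define its own in_span
// with the same signature without causing a multiple definition error.
bool in_span(int Ch, int First, int Last) {
  return First <= Ch && Ch <= Last;
}

// A second local helper built on the first, also internal to this file.
bool is_lower_for_test(int Ch) { return in_span(Ch, 'a', 'z'); }

} // namespace
```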
This was added in llvm#117573 but the options were not being rendered correctly due to the missing newline after `::`.
…lvm#119252) Reverts llvm#112277 This broke something on Fuchsia's Mac builders, so there's still something in the CMake that needs to be updated before we reland. Failed build: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-mac-xarm64/b8729005878443108801/overview
…llvm#117195)" (llvm#119247) The previous patch https://github.com/llvm/llvm-project/pull/116860/files#diff-e7e06355c973f68f900d2a34a4103dbfa022589c55c59d02870da9365acf7b98L651 seems to have mistakenly overwritten true16 test lines, i.e.
```
v_fmaak_f16 v5.l, v1.l, v2.l, 0xfe0b
```
to
```
v_fmaak_f16 v5, v1, v2, 0xfe0b
```
The plan is to revert patches llvm#117195 and llvm#116860 and redo these two. This is the revert of patch 117195. The revert of 116860 will be in a separate patch.
Approximates the shadow propagation via OR'ing. Updates the neon_vmul.ll test introduced in llvm#117935
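As a rough illustration of OR-based shadow approximation: if a set shadow bit marks the corresponding value bit as uninitialized, OR'ing the operand shadows makes the result poisoned wherever either input is. The function name `approxShadow` and the scalar 32-bit setting are assumptions for this sketch; the actual instrumentation operates on LLVM IR vector intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Shadow convention assumed here: a 1 bit means the corresponding bit of
// the value is uninitialized.
constexpr std::uint32_t approxShadow(std::uint32_t ShadowA,
                                     std::uint32_t ShadowB) {
  // Conservative approximation: the result is treated as poisoned wherever
  // either operand is poisoned.
  return ShadowA | ShadowB;
}
```

The approximation is cheap but imprecise: it does not track how the operation actually moves bits between positions.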
llvm#119253) The previous patch https://github.com/llvm/llvm-project/pull/116860/files#diff-e7e06355c973f68f900d2a34a4103dbfa022589c55c59d02870da9365acf7b98L651 seems to have mistakenly overwritten true16 test lines, i.e. v_fmaak_f16 v5.l, v1.l, v2.l, 0xfe0b to v_fmaak_f16 v5, v1, v2, 0xfe0b. The plan is to revert patches llvm#117195 and llvm#116860 and redo these two. This is the revert of patch 116860.
…m#119202) This PR improves the general validity of emitted code between passes by generating `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after Instruction Selection, fixing generation of OpTypePointer instructions, and using proper virtual register classes. Using `TargetOpcode::PHI` instead of `SPIRV::OpPhi` after Instruction Selection has the benefit of supporting existing optimization passes immediately, as an alternative to disabling those passes that use `MI.isPHI()`. This PR thus makes it possible to revert the actions of llvm#116060 and go back to using the `MachineSink` pass. This PR is a solution to the problem discussed in detail in llvm#110507. It follows the advice of the code reviewers of llvm#110507 to postpone generation of OpPhi rather than patch CodeGen. This solution unblocks improvements with respect to expensive checks and keeps this work independent of the general discussion about OpPhi vs. G_PHI/PHI. This PR contains numerous small fixes to the validity of emitted code that substantially improve the pass rate with expensive checks. Namely, the test suite with expensive checks set ON now has only 12 failures out of 569 total test cases. FYI @bogner
FWICT, these were the newly added headers for c11.
The functions are not relevant for most sanitizers and only required for MSan to see which regions have been written to. This eliminates a link dependency for all other sanitizers and fixes llvm#59007: while `-lresolv` had been added for the static runtime in 6dce56b, it wasn't added to the shared runtimes. Instead of just moving the interceptors, we adapt them to MSan conventions: * We don't skip intercepting when `msan_init_is_running` is true, but directly call ENSURE_MSAN_INITED() like most other interceptors. It seems unlikely that these functions are called during initialization. * We don't unpoison `errno`, because none of the functions is specified to use it.
We do not have CI coverage for Windows/macOS, and we regularly run into problems where changes break the post-commit fullbuild, which is not tested in pre-commit builds. This PR uses GitHub Actions to address such issues.
…nc and gpu.func (llvm#119034) Use `pm.nest` to schedule the pass on nested `func.func` and `gpu.func` in the `gpu.module`. AbstractResult pass is not meant to run on the whole gpu.module at once.
[AutoBump] Merge with 1d4b5c1 (Dec 09) (21)[Only tested MLIR]
No description provided.