LLVM and SPIRV-LLVM-Translator pulldown (WW41 2024) #15669
This reverts commit cf02d8b.
This reverts commit 2383bc8.
This reverts commit b4a8e87.
…Error (NFC) (#106774)" This reverts commit 06939fa.
…rotocol" This reverts commit a7c1745.
Fix windows test after #108921.
…epCandidate() (#109212) These are helper functions to be used by the vectorizer's dependency graph.
Resolves #94928. This PR adds an `if (TD->getTemplateDecl())` check to prevent `InnerD` from becoming `nullptr`, as suggested by @firstmoonlight. I also added the `-ast-dump-decl-types` option and `CHECK` lines for the declared types to the test case `clang/test/AST/ast-dump-concepts.cpp`. --------- Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This patch improves the documentation for JITLink by fixing some typos, correcting indentation and fixing outdated code examples.
…uild. (#109078)" (#109207) `std::complex` operators do not work for the CUDA device compilation of F18 runtime. This change makes use of `cuda::std::complex` from `libcudacxx`. `cuda::std::complex` does not have specializations for `long double`, so the change is accompanied by a clean-up of `long double` usage. The additional change on top of #109078 is to use `cuda::std::complex` only for the device compilation; otherwise the host compilation fails because `libcudacxx` may not support the `long double` specialization at all (depending on the compiler).
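A minimal sketch of the host/device split described above, not the actual F18 runtime code. The macro `SKETCH_USE_LIBCUDACXX`, the namespace `sketch`, and the helper `square` are hypothetical names introduced only for illustration; `__CUDA_ARCH__` is the macro CUDA compilers define during the device compilation pass.

```cpp
#include <complex>

// SKETCH_USE_LIBCUDACXX is a hypothetical opt-in macro; the real runtime may
// gate this choice differently.
#if defined(SKETCH_USE_LIBCUDACXX) && defined(__CUDA_ARCH__)
// Device compilation pass: use the libcudacxx type, whose operators work on
// the device.
#include <cuda/std/complex>
namespace sketch {
template <typename T> using Complex = cuda::std::complex<T>;
} // namespace sketch
#else
// Host compilation (or libcudacxx not requested): keep std::complex, which
// also covers long double.
namespace sketch {
template <typename T> using Complex = std::complex<T>;
} // namespace sketch
#endif

namespace sketch {
// Code written against the alias compiles with either underlying type,
// because both provide operator*.
template <typename T> Complex<T> square(Complex<T> z) { return z * z; }
} // namespace sketch
```

On a plain host build the alias resolves to `std::complex`, so callers never see the libcudacxx type unless the device pass is active.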
…109176) The API is present, and we even have a test for it, but it isn't documented, so probably no one knows you can set requirements for your scripted commands. This just adds docs and uses it appropriately in the `framestats` example command.
… is marked Promote. We have a special check that tries to determine if vector FP operations are supported for the type to determine whether to scalarize or not. If FP arithmetic would be promoted, don't unroll. This improves Zvfhmin codegen on RISC-V.
Check that the destination of G_EXTRACT_SUBVECTOR is smaller than the source. Improve wording of error messages.
- Improve messages.
- Remove redundant checks that are handled in generic code.
- Add check that the subvector is smaller than the vector.
- Add checks that subvector is smaller than the vector.
This revision adds vector predication smax, smin, umax and umin intrinsic ops.
…ariable (#109213) This patch adds new runtime entry points that perform the simple allocation/deallocation of module allocatable variables with cuda attributes. When the allocation is initiated on the host, the descriptor on the device is synchronized. Both descriptors point to the same data on the device. This is the first PR of a stack.
… (#109195) Change RegisterBankEmitter to use const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Convert `cuf.allocate` and `cuf.deallocate` to the runtime entry points added in #109213. This was reviewed in llvm/llvm-project#109214, but the parent branch was closed for some reason.
Added tests to the validator and fixed issues stemming from the previous behavior of skipping over BBs with single successors, which is incorrect. The added tests now catch this: they expect the assertions to be triggered.
…ntable callsites (#109184) Reinforcing properties ensured at instrumentation time.
Example: https://lab.llvm.org/buildbot/#/builders/169/builds/3381 The CI allowed the `llvm::append_range` instantiation, but on the other hand it's quite unnecessary here.
The code was passing a physical register directly to getPressureSets which expects a register unit. Fix this by looping over the register units and calling getPressureSets for each of them. Found while trying to add a RegisterUnit class to stop storing register units in `Register`. 0 is a valid register unit but not a valid Register.
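The per-unit loop described above can be sketched with a toy model. Every type and function here (`ToyRegInfo`, `pressureSetsForUnit`, `pressureSetsForPhysReg`) is a hypothetical stand-in, not LLVM's real API; the sketch only shows why a lookup keyed by register unit has to be driven by iterating over a register's units instead of passing the register itself.

```cpp
#include <map>
#include <set>
#include <vector>

// Toy stand-ins; these are not LLVM's real types or signatures.
using PhysReg = unsigned;
using RegUnit = unsigned;
using PressureSet = int;

struct ToyRegInfo {
  std::map<PhysReg, std::vector<RegUnit>> unitsOfReg;
  std::map<RegUnit, std::vector<PressureSet>> setsOfUnit;

  // Keyed by register unit, not by physical register.
  const std::vector<PressureSet> &pressureSetsForUnit(RegUnit unit) const {
    static const std::vector<PressureSet> empty;
    auto it = setsOfUnit.find(unit);
    return it == setsOfUnit.end() ? empty : it->second;
  }
};

// Collect the pressure sets of a physical register by visiting each of its
// register units and querying the per-unit table.
std::set<PressureSet> pressureSetsForPhysReg(const ToyRegInfo &info,
                                             PhysReg reg) {
  std::set<PressureSet> result;
  auto it = info.unitsOfReg.find(reg);
  if (it == info.unitsOfReg.end())
    return result;
  for (RegUnit unit : it->second)
    for (PressureSet ps : info.pressureSetsForUnit(unit))
      result.insert(ps);
  return result;
}
```

In this model unit `0` is an ordinary key, which mirrors the point that 0 is a valid register unit even though it is not a valid register.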
Change variable name `o` to `OS` to match definition, and `ClName` to `ClassName` for better clarity. Cache RegBank reference in the class and do not pass around class members to functions.
…r (#108094) Make sure there is no data transfer generated when a device variable is used in these intrinsic functions.
…er (#109194) Change PseudoLoweringEmitter to use const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
…109189) Change InstrInfoEmitter to use const RecordKeeper. This is part of an effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
…8663) macOS 10.15 added a "full" x86_64 GPR thread state flavor, equivalent to the normal one but with DS, ES, SS, and GSbase added. This flavor can only be used with processes that install a custom LDT (functionality that was also added in 10.15 and is used by apps like Wine to execute 32-bit code). Along with allowing DS, ES, SS, and GSbase to be viewed/modified, using the full flavor is necessary when debugging a thread executing 32-bit code. If thread_set_state() is used with the regular thread state flavor, the kernel resets CS to the 64-bit code segment (see [set_thread_state64()](https://github.com/apple-oss-distributions/xnu/blob/94d3b452840153a99b38a3a9659680b2a006908e/osfmk/i386/pcb.c#L723)), which makes debugging impossible. There's no way to detect whether the full flavor is available, so this patch tries to use it and falls back to the regular one if it's not available. A downside is that this patch exposes the DS, ES, SS, and GSbase registers for all x86_64 processes, even though they are not populated unless the full thread state is available. I'm not sure if there's a way to tell LLDB that a register is unavailable. The classic GDB `g` command [allows returning `x`](https://sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html#Packets) to denote unavailable registers, but it seems like the debug server uses newer commands like `jThreadsInfo` and I'm not sure if those have the same support. Fixes #57591 (also filed as Apple FB11464104) @jasonmolenda
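The try-then-fall-back strategy in the message above can be sketched as follows. This is an illustration only, not the debugserver code: `Flavor`, `GprState`, `Fetch`, and `readGprState` are hypothetical names, and the `fetch` callback stands in for whatever kernel call actually reads the thread state.

```cpp
#include <functional>
#include <optional>

// Hypothetical model of the two GPR thread state flavors.
enum class Flavor { Regular, Full };

struct GprState {
  Flavor flavor; // which flavor produced this state; registers omitted
};

// Hypothetical stand-in for the kernel query: returns std::nullopt when the
// requested flavor is rejected.
using Fetch = std::function<std::optional<GprState>(Flavor)>;

// Availability of the full flavor cannot be detected up front, so request it
// first and fall back to the regular flavor only if that request fails.
std::optional<GprState> readGprState(const Fetch &fetch) {
  if (auto full = fetch(Flavor::Full))
    return full;
  return fetch(Flavor::Regular);
}
```

A caller can inspect `flavor` on the result to know whether DS, ES, SS, and GSbase were actually populated.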
This is ready for review.
Before b7b28e7, AreSupportedUsers skipped MemTransferInst, which could cause an unexpected assertion: https://godbolt.org/z/z5d691fj1 Since b7b28e7 starts to allow MemTransferInst, it should be allowed in adjustByValArgAlignment too. (cherry picked from commit 0bbdc76)
457b79d to 6f4c075
llvm-spirv/test/extensions/INTEL/SPV_INTEL_function_pointers/CodeSectionINTEL/alias.ll
@intel/llvm-gatekeepers Please help to issue a /merge. The dev CI and AMD failures are unrelated; they also fail on other PRs.
/merge
Thu 17 Oct 2024 04:10:01 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.
Thu 17 Oct 2024 04:14:51 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.
LLVM: llvm/llvm-project@2f50b28
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@d3e72db7e0d74f4