[SYCL][PI][CUDA] Update queries for atomic order and scope for CUDA #4853
…abilities for CUDA devices
Looks good, though I have some small comments while this is blocked anyway.
Could you please add a case for the new descriptors in the other PI plugins, like with PI_DEVICE_INFO_ATOMIC_MEMORY_ORDER_CAPABILITIES? I don't expect full implementations, but for consistency it helps keep track of which ones aren't implemented in the corresponding plugins.
Also, would you mind changing this to a draft PR and adding "Draft: " or "[WIP]" (I think the former is more visible) to the title? This is just to prevent it from being prematurely merged by mistake.
After reviewing #4820 I do not think having it merged is enough to unblock this PR. The reason is that, even though it introduces atomic operations with additional memory orders (acq_rel, acquire, and release), these are still not supported by atomic load/store in LLVM's NVPTX implementation. For more context see llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp#L859 (and similar for store). This previously caused libclc to fail to build kernels that would even remotely consider using atomic load/store with anything stricter than "unordered" memory order (see 4876443).
That should be done now.
@t4c1, please update the ABI tests.
The ABI tests should be fixed now.
Approving to start testing.
Dismissing the approval to avoid an unintentional merge, as this PR is labeled as a draft.
LGTM. Thanks for adding this!
@t4c1, please resolve the merge conflicts.
# Conflicts:
#	sycl/include/CL/sycl/info/info_desc.hpp
Given #4853 (comment), I would say no. In the meantime I also found that some atomics without direct PTX equivalents are missing, and I will add them soon.
All the PRs blocking this have now been merged.
Does this need to wait for something else or can it be merged?
@againull, @s-kanaev or @smaslov-intel are expected to approve the Level Zero plug-in changes. Folks, could you take a look, please?
Plugin changes look good to me.
@t4c1, this change broke two tests in llvm-test-suite. SYCL :: AtomicRef/atomic_memory_order_acq_rel.cpp Error message:
Could you take a look, please?
This change just let these tests run for CUDA. What is broken is the tests themselves: they use the pattern `auto ld = aar.load(); ld += 1; aar.store(ld);` and check whether the whole sequence of operations is atomic. It is not; only each of the operations (load/store) on its own is atomic. I am not sure what these tests are supposed to check ... maybe we can just remove them?
Also, as far as I know, no backend supported acquire-release (or sequentially consistent) order before this PR was merged, so these tests were never actually run.
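To illustrate why a separate load and store cannot stand in for a single atomic read-modify-write, here is a minimal host-side sketch. It uses `std::atomic` and `std::thread` as a stand-in for `sycl::atomic_ref` in a kernel, which is an assumption made only so the example is self-contained; the function names are hypothetical and not taken from the tests.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Increments a shared counter with a separate atomic load and atomic store.
// Each operation is atomic on its own, but another thread may store between
// the load and the store, so increments can be lost.
int run_load_then_store(int threads, int iters) {
  std::atomic<int> counter{0};
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&counter, iters] {
      for (int i = 0; i < iters; ++i) {
        int ld = counter.load(std::memory_order_acquire);
        ld += 1;
        counter.store(ld, std::memory_order_release);
      }
    });
  }
  for (auto &th : pool) th.join();
  return counter.load();
}

// Increments a shared counter with a single atomic read-modify-write.
// No update can be lost, so the result is always threads * iters.
int run_fetch_add(int threads, int iters) {
  std::atomic<int> counter{0};
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&counter, iters] {
      for (int i = 0; i < iters; ++i) {
        counter.fetch_add(1, std::memory_order_acq_rel);
      }
    });
  }
  for (auto &th : pool) th.join();
  return counter.load();
}
```

With several threads, `run_fetch_add` always returns `threads * iters`, while `run_load_then_store` may return any smaller positive value depending on scheduling. A test that expects the exact total from the load/store pattern is therefore checking a property that no memory order provides.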
Could you create a patch to llvm-test-suite that removes the illegal checks, and add @steffenlarsen to discuss this change, please?
Removes `atomic_memory_order*` tests, which are broken. They are using the pattern:

```
auto ld = aar.load();
ld += 1;
aar.store(ld);
```

and checking if the whole sequence of operations is atomic. Which it is not - only each of the operations (load/store) on its own is atomic.

Before intel/llvm#4853 was merged no backend supported acquire release or sequentially consistent memory orders, so these tests were never run before.

This issue was first discussed here: intel/llvm#4853 (comment)
Updates the values returned by the atomic memory order and scope capability queries to bring them in line with the changes in #4820.
This includes adding a query for atomic memory scope capabilities, which did not exist before.
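As a rough picture of what such capability queries look like at the plugin level, the sketch below models a device-info query that reports supported memory orders and scopes as bitmasks. All names (`DeviceInfo`, `OrderAcquire`, `full_plugin_get_info`, and so on) are hypothetical stand-ins rather than the real PI descriptors, and the reported values are illustrative, not what the CUDA plugin returns. The second function shows a plugin that lists the descriptors explicitly without implementing them, so that it stays easy to see which queries are missing where.

```cpp
#include <cstdint>

// Hypothetical stand-ins for plugin-level device-info descriptors.
enum class DeviceInfo {
  AtomicMemoryOrderCapabilities,
  AtomicMemoryScopeCapabilities
};

// Hypothetical capability bits, one per memory order / memory scope.
enum MemoryOrderBits : std::uint32_t {
  OrderRelaxed = 1u << 0,
  OrderAcquire = 1u << 1,
  OrderRelease = 1u << 2,
  OrderAcqRel = 1u << 3,
  OrderSeqCst = 1u << 4
};
enum MemoryScopeBits : std::uint32_t {
  ScopeWorkItem = 1u << 0,
  ScopeSubGroup = 1u << 1,
  ScopeWorkGroup = 1u << 2,
  ScopeDevice = 1u << 3,
  ScopeSystem = 1u << 4
};

// A plugin that implements both queries. Returns true and writes the bitmask
// on success. The values are illustrative only.
bool full_plugin_get_info(DeviceInfo param, std::uint32_t *out) {
  switch (param) {
  case DeviceInfo::AtomicMemoryOrderCapabilities:
    *out = OrderRelaxed | OrderAcquire | OrderRelease | OrderAcqRel;
    return true;
  case DeviceInfo::AtomicMemoryScopeCapabilities:
    *out = ScopeWorkItem | ScopeSubGroup | ScopeWorkGroup | ScopeDevice;
    return true;
  }
  return false;
}

// A plugin that has not implemented the queries yet. The descriptors are
// still listed explicitly so the gap is visible in the source.
bool stub_plugin_get_info(DeviceInfo param, std::uint32_t *out) {
  (void)out; // nothing is written for unimplemented queries
  switch (param) {
  case DeviceInfo::AtomicMemoryOrderCapabilities:
  case DeviceInfo::AtomicMemoryScopeCapabilities:
    return false; // not implemented in this plugin
  }
  return false;
}

// Helper: true if the capability bitmask contains the given bit.
bool has_capability(std::uint32_t caps, std::uint32_t bit) {
  return (caps & bit) != 0;
}
```

A caller would first check the return value and only then test individual bits, for example `has_capability(caps, OrderAcqRel)`, before choosing a memory order for an atomic operation.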