Fix SYCL atomics for local memory #4585
Also, see intel/llvm#4669 (comment). We would likely replace the checks here with |
@masterleinad Wow, that was fast. Yes, it did fix local atomics :)
Now, we just need to find a way to make that work for |
fd2bc03 to bfc6159
I decided, for now, to only fix the behavior for Intel GPUs. I'm happy to create a corresponding pull request on the |
sycl::access::address_space::local_space> \
dest_ref(*dest); \
return dest_ref.fetch_##OPER(val); \
} else { \
Why is it not
auto g = __SYCL_GenericCastToPtrExplicit_ToGlobal<TYPE>(dest);
if (g) { ...
Do you want both checks?
For a single check, I chose the space to test fairly arbitrarily. Do you have any arguments for checking for the global space instead?
No, I was just wondering whether we should check that the pointer is indeed global.
I don't feel strongly about this. As of now, there is https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/CL/sycl/atomic_ref.hpp#L42-L49 but also https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/CL/sycl/atomic_ref.hpp#L116-L120. All that is to say that really only global_space and local_space are supported (and we would get rid of all of this as soon as generic_space is available).
At this point, I slightly prefer not to have another check for every call to SYCL atomics. If you (or others) prefer, we can add the check and adopt https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/sycl/ext/oneapi/sub_group.hpp#L442-L443.
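For readers following the trade-off discussed above, here is a minimal host-only sketch of the two options: a single check for the local space with everything else assumed to be global, versus an additional check that the pointer is indeed global. It is not the code in this pull request. The address-space casts are mocked with plain address-range tests, and every name in it (`local_arena`, `global_arena`, `cast_to_local`, `cast_to_global`, `fetch_add_single_check`, `fetch_add_checked`) is hypothetical and chosen for illustration only; on a real device the cast would be a compiler builtin and the atomic would be a `sycl::atomic_ref` with the matching address space.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for device memory regions (mock only).
static std::atomic<int> local_arena[4];   // plays the role of local_space
static std::atomic<int> global_arena[4];  // plays the role of global_space

// Mock helper: true if p points into the n-element array starting at base.
inline bool in_arena(const std::atomic<int>* p, const std::atomic<int>* base,
                     int n) {
  const auto a = reinterpret_cast<std::uintptr_t>(p);
  const auto lo = reinterpret_cast<std::uintptr_t>(base);
  const auto hi = reinterpret_cast<std::uintptr_t>(base + n);
  return a >= lo && a < hi;
}

// Mocked explicit casts: return the pointer if it belongs to the requested
// space and nullptr otherwise.
inline std::atomic<int>* cast_to_local(std::atomic<int>* p) {
  return in_arena(p, local_arena, 4) ? p : nullptr;
}
inline std::atomic<int>* cast_to_global(std::atomic<int>* p) {
  return in_arena(p, global_arena, 4) ? p : nullptr;
}

// Option 1: one check. Anything that is not local is treated as global,
// so a pointer into any other space is not detected.
inline int fetch_add_single_check(std::atomic<int>* dest, int val) {
  if (std::atomic<int>* l = cast_to_local(dest)) {
    return l->fetch_add(val);  // local-space path
  }
  return dest->fetch_add(val);  // assumed global-space path
}

// Option 2: both checks. Costs a second test on every non-local call, but
// rejects pointers that are neither local nor global.
inline int fetch_add_checked(std::atomic<int>* dest, int val) {
  if (std::atomic<int>* l = cast_to_local(dest)) {
    return l->fetch_add(val);  // local-space path
  }
  std::atomic<int>* g = cast_to_global(dest);
  assert(g && "pointer is neither in the local nor in the global space");
  return g->fetch_add(val);  // verified global-space path
}
```

In this mock, a pointer to a stack variable is classified as neither local nor global, which is the case the second check would catch and the single check would silently route to the global path.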
Retest this please.
82e87e7 to d3dacda
Please do so.
See desul/desul#48.
Fixes #4582. @brian-kelley Can you please check if this fixes the local atomics issue for you?