Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SYCL atomics for local memory #4585

Merged
merged 7 commits into from
Dec 14, 2021

Conversation

masterleinad
Copy link
Contributor

Fixes #4582. @brian-kelley Can you please check if this fixes the local atomics issue for you?

@masterleinad
Copy link
Contributor Author

Also, see intel/llvm#4669 (comment). We would likely replace the checks here with address_space_cast as soon as it's available in the compiler or use generic_space as soon as it's available.
IMHO, passing the address space from the outside is likely neither an option simplicity (sycl::local_ptr, sycl::global_ptr) nor explicitly.

@brian-kelley
Copy link
Contributor

@masterleinad Wow, that was fast. Yes, it did fix local atomics :)

@masterleinad
Copy link
Contributor Author

Now, we just need to find a way to make that work for SYCL+CUDA...

@masterleinad
Copy link
Contributor Author

I decided, for now, to only fix the behavior for Intel GPUs. I'm happy to create a corresponding pull request on the desul side if we are happy with this.

@masterleinad masterleinad marked this pull request as ready for review December 3, 2021 20:49
sycl::access::address_space::local_space> \
dest_ref(*dest); \
return dest_ref.fetch_##OPER(val); \
} else { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it not

auto g = __SYCL_GenericCastToPtrExplicit_ToGlobal<TYPE>(dest);
if (g) { ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want both checks?
For one check, I choose the space to test pretty arbitrarily. Do you have any arguments for rather checking for the global space?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I was just wondering whether we should check that the pointer is indeed global.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel overly strong about this. As of know, there is https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/CL/sycl/atomic_ref.hpp#L42-L49 but also https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/CL/sycl/atomic_ref.hpp#L116-L120. All that is to say that really only global_space and local_space are supported (and we would get rid of all of this as soon as generic_space is available).
At this point, I slightly prefer not to have another check for every call SYCL atomics. If you (or others) prefer we can add the check and adopt https://github.com/intel/llvm/blob/a3e9aab6a25a9fc69e805f1a93f409375ce1f7d1/sycl/include/sycl/ext/oneapi/sub_group.hpp#L442-L443.

core/src/desul/atomics/SYCL.hpp Outdated Show resolved Hide resolved
@masterleinad
Copy link
Contributor Author

Retest this please.

1 similar comment
@masterleinad
Copy link
Contributor Author

Retest this please.

@dalg24
Copy link
Member

dalg24 commented Dec 8, 2021

I decided, for now, to only fix the behavior for Intel GPUs. I'm happy to create a corresponding pull request on the desul side if we are happy with this.

Please do so

@masterleinad
Copy link
Contributor Author

Please do so

See desul/desul#48.

@dalg24 dalg24 merged commit 3a3c033 into kokkos:develop Dec 14, 2021
@masterleinad masterleinad deleted the fix_sycl_local_atomics branch January 6, 2022 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants