Memset builtins #45
Thanks @abagusetty! I think I'll have time to work on this today. I looked up C's
For SYCL, do we want to specialize the template for
Regarding CUDA and HIP, memset is synchronous, and I need to do some testing to see what the cost / benefit is of causing a sync versus faster memory filling. I might create a CPP define to control this behavior for users who feel the need to speed up memset with integers and who do not mind a synchronization point.
Thanks for catching the printf issue. I took it for granted that SYCL would allow it on the device. Is this being planned in later SYCL specs? It would be useful.
This might be a cleaner option. All forms (CUDA, HIP, SYCL, HOST) of
An experimental form of
Ah, it looks like
Having an issue open for better
I have to admit, though, I'm confused by the memset documentation. They keep saying "byte" value, but the input is
Yeah, we'll just use memset for zero values then... Back to what you had originally.
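The "byte value" wording in the memset documentation can be illustrated with a small host-only sketch. Standard `std::memset` takes an `int` but converts it to `unsigned char` and writes that byte into every byte of the buffer, so for multi-byte types only zero is generally safe. The helper names `fill_sketch` and `memset_one_result` are hypothetical and are not part of YAKL; the device back ends (CUDA, HIP, SYCL) are not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not YAKL's API): fill `count` elements with `value`.
// memset is used only when `value` is bitwise zero; otherwise fall back to
// an element-wise loop, because memset writes a single repeated byte.
template <class T>
void fill_sketch(T *data, std::size_t count, T value) {
  static_assert(std::is_arithmetic<T>::value,
                "sketch is limited to arithmetic types");
  assert(data != nullptr || count == 0);
  if (count == 0) return;
  T zero{};  // value-initialized: all bytes zero for arithmetic types
  if (std::memcmp(&value, &zero, sizeof(T)) == 0) {
    std::memset(data, 0, count * sizeof(T));  // every byte becomes 0
  } else {
    for (std::size_t i = 0; i < count; ++i) data[i] = value;
  }
}

// Shows what memset does with a nonzero value: each byte of the object is
// set to (unsigned char)1, so the resulting integer is not 1.
inline unsigned int memset_one_result() {
  unsigned int x = 0;
  std::memset(&x, 1, sizeof(x));
  return x;
}
```

Assuming a 4-byte `unsigned int` with 8-bit bytes, `memset_one_result()` yields `0x01010101` rather than `1`, which is why restricting memset to zero fills is the conservative choice.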
@@ -115,72 +115,17 @@
sycl::access::address_space::global_space>
using relaxed_atomic_ref =
sycl::ext::oneapi::atomic_ref< T,
sycl::ext::oneapi::memory_order::relaxed,
sycl::ext::oneapi::memory_order::seq_cst,
@abagusetty, why this change?
This was mostly for better stability of atomics with the future SDK updates and hardware releases expected in the upcoming weeks. Currently, this change behaves no differently from the previous relaxed ordering.
@@ -34,6 +34,12 @@
#define YAKL_CURRENTLY_ON_HOST() (! defined(__SYCL_DEVICE_ONLY__))
#define YAKL_CURRENTLY_ON_DEVICE() (defined(__SYCL_DEVICE_ONLY__))

#ifdef __SYCL_DEVICE_ONLY__
CONSTANT doesn't appear to be used anywhere. Is this for future SYCL work?
You are right, the CONSTANT was for the use case of sycl::printf, https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/oneapi/experimental/builtins.hpp#L67. For example:
const CONSTANT char format[] = "KERNEL CHECK FAILED:\n %s\n %s\n";
sycl::ext::oneapi::experimental::printf(format,msg);
Given that printf("%s\n") with const char * is not really aligning with the above sycl::printf signature
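A minimal host-side sketch of the CONSTANT pattern discussed above. The macro definition follows the pattern described alongside the experimental printf in the linked intel/llvm header, as far as I recall it, so treat the `opencl_constant` attribute as an assumption. The function `format_check_failure` is hypothetical and uses `std::snprintf` as a stand-in; in device code the call would instead be `sycl::ext::oneapi::experimental::printf(format, msg)`, which is not exercised here.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Assumption: on the device pass the format string must live in the constant
// address space; on the host the macro expands to nothing.
#ifdef __SYCL_DEVICE_ONLY__
#define CONSTANT __attribute__((opencl_constant))
#else
#define CONSTANT
#endif

// Hypothetical host-only stand-in for the device printf call.
// The format string is a named array (not a literal passed inline), which is
// the form the CONSTANT qualifier is applied to.
inline std::string format_check_failure(const char *msg) {
  static const CONSTANT char format[] = "KERNEL CHECK FAILED:\n   %s\n";
  char buffer[256];
  int n = std::snprintf(buffer, sizeof(buffer), format, msg);
  assert(n >= 0);  // a negative value signals an encoding error
  return std::string(buffer);  // truncated to 255 chars if msg is very long
}
```

This sketch uses one `%s` per argument so that the format string and the argument list agree.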
All features in this branch have been merged to master.
YAKL_DEBUG issue with SYCL backend. printf is not supported in device code for yakl_throw()