Reuse memory in SYCL parallel_scan #3899
7283ec1 to 2f7e870
Retest this please.
// FIXME_SYCL consider only storing one value per block and recreate initial
// results in the end before doing the final pass
m_scratch_space = static_cast<pointer_type>(
instance.scratch_space(sizeof(value_type) * total_memory));
Doesn't `total_memory` already take into account `sizeof(value_type)`?
Oh, you're right.
std::size_t total_memory = 0;
{
size_t wgroup_size = 32;
size_t n_nested_size = len;
size_t n_nested_wgroups;
do {
total_memory += sizeof(value_type) * n_nested_size;
n_nested_wgroups = (n_nested_size + wgroup_size - 1) / wgroup_size;
n_nested_size = n_nested_wgroups;
} while (n_nested_wgroups > 1);
total_memory += sizeof(value_type) * wgroup_size;
}
Comment on what is going on here.
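One way the computation above could be commented is sketched below. This is a standalone illustration, not code from the PR: the helper name `scan_scratch_bytes` and its `wgroup_size` parameter are hypothetical, and the comments are a reading of the loop as posted rather than a statement of the author's intent. The thread does not say what the final `wgroup_size` term is for, so the comment on it only describes what the line does.

```cpp
#include <cstddef>

// Hypothetical helper (name and signature are assumptions, not Kokkos API).
// Returns the scratch size in BYTES for a multi-level scan over `len` values.
template <typename value_type>
constexpr std::size_t scan_scratch_bytes(std::size_t len,
                                         std::size_t wgroup_size = 32) {
  std::size_t total_memory     = 0;
  std::size_t n_nested_size    = len;
  std::size_t n_nested_wgroups = 0;
  do {
    // Every level of the scan stores one value per element it processes.
    total_memory += sizeof(value_type) * n_nested_size;
    // Each work-group produces one partial result; those partial results
    // are the input of the next (smaller) level.
    n_nested_wgroups = (n_nested_size + wgroup_size - 1) / wgroup_size;
    n_nested_size    = n_nested_wgroups;
  } while (n_nested_wgroups > 1);
  // The snippet reserves room for one more work-group's worth of values.
  total_memory += sizeof(value_type) * wgroup_size;
  return total_memory;
}
```

Because every term is already multiplied by `sizeof(value_type)`, the result is a byte count and would be passed to the allocation as is, which is the point raised in the earlier review comment.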
Retest this please.
Based on top of #3873. This should be the last occurrence of a raw call to `sycl::malloc` outside of `SYCLDeviceUSMSpace::allocate()` or `SYCLSharedUSMSpace::allocate()`.