Fix bug with CUDA's team reduction for empty ranges #5079
Force-pushed from 50a0fb4 to 720857b.
Looks good to me.
using ReducerType = Kokkos::Sum<double>;
double result = 10.;
ReducerType reducer(result);
Why do you bother with a reducer?
Why do you initialize `result` at all?
I initialize it so that I can see that it is actually set to zero by the reduction (and is not zero merely by chance because it was uninitialized).
I need the reducer to trigger the bug. Otherwise `need_device_set` would not be true.
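The reasoning behind the nonzero initial value can be illustrated without Kokkos. The following is only a plain C++ sketch: `sum_reduce` and `reduce_empty_range_with_sentinel` are hypothetical helpers, not Kokkos API, and they assume that a sum reduction always writes the identity (zero) before accumulating.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a sum reduction (not the Kokkos API).
// The result is reset to the identity of the sum (0) before accumulating,
// so an empty range still overwrites whatever the caller stored in it.
inline void sum_reduce(const std::vector<double>& values, double& result) {
  result = 0.;  // identity element of the sum
  for (double v : values) result += v;
}

// Mirrors the setup discussed in this thread: start from a sentinel of 10
// so that a zero afterwards can only come from the reduction itself.
inline double reduce_empty_range_with_sentinel() {
  double result = 10.;
  sum_reduce({}, result);  // empty range
  return result;
}
```

If the reduction skipped the write for an empty range, the sentinel would survive and the check would fail, which is the kind of failure the nonzero initial value is meant to expose.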
But that is what it would default to if you passed a scalar, isn't it?
kokkos/core/src/Cuda/Kokkos_Cuda_Parallel_Team.hpp
Lines 866 to 874 in ecbfc2f
const bool is_empty_range = m_league_size == 0 || m_team_size == 0;
const bool need_device_set = Analysis::has_init_member_function ||
                             Analysis::has_final_member_function ||
                             !m_result_ptr_host_accessible ||
#ifdef KOKKOS_CUDA_ENABLE_GRAPHS
                             Policy::is_graph_kernel::value ||
#endif
                             !std::is_same<ReducerType, InvalidType>::value;
if (!is_empty_range || need_device_set) {
Using a reducer, I get both `has_init_member_function` and `has_reducer`. If I only provide a scalar, `need_device_set` is false.
For the reducer case, the different parts of `need_device_set` are
init: 1
final: 0
host accessible: 1
has reducer: 1
and for the scalar case they are
init: 0
final: 0
host accessible: 1
has reducer: 0
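To make the two cases concrete, here is a minimal plain C++ model of the quoted condition. It is a sketch under stated assumptions: `ReduceTraits`, `enters_branch`, `reducer_case`, and `scalar_case` are hypothetical names chosen to match the labels printed above, and the graph-kernel term is left out (that is, it assumes `KOKKOS_CUDA_ENABLE_GRAPHS` is not defined or the policy is not a graph kernel).

```cpp
#include <cassert>

// Hypothetical model of the inputs to need_device_set; the field names
// follow the labels printed in the comment above, not Kokkos identifiers.
struct ReduceTraits {
  bool has_init;         // Analysis::has_init_member_function
  bool has_final;        // Analysis::has_final_member_function
  bool host_accessible;  // m_result_ptr_host_accessible
  bool has_reducer;      // !std::is_same<ReducerType, InvalidType>::value
};

// Same boolean structure as the quoted snippet, minus the graph-kernel term.
constexpr bool need_device_set(const ReduceTraits& t) {
  return t.has_init || t.has_final || !t.host_accessible || t.has_reducer;
}

// Whether `if (!is_empty_range || need_device_set)` would be entered.
constexpr bool enters_branch(int league_size, int team_size,
                             const ReduceTraits& t) {
  return !(league_size == 0 || team_size == 0) || need_device_set(t);
}

// Values reported in this thread for the two cases.
constexpr ReduceTraits reducer_case{true, false, true, true};
constexpr ReduceTraits scalar_case{false, false, true, false};

static_assert(need_device_set(reducer_case), "reducer case needs device set");
static_assert(!need_device_set(scalar_case), "scalar case does not");
```

With these values, an empty range enters the branch only in the reducer case, which is consistent with the statement that the reducer is needed to trigger the bug.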
Is the file
It is not new. Note that it is a private header so it does not exist as far as users are concerned.
You will have to wait for Kokkos 3.7 (we expect to create a release candidate branch and start testing in two weeks) or try snapshotting on your own.
@dalg24 Since it's a private header, I'm unsure how the snapshotting works.
Looks good to me.
Force-pushed from 5a2b7f4 to 1d67306.
Retest this please.
Not sure why the status wasn't updated, but the tests did pass.
Fixes #5075. This was only triggered when scratch memory was requested.