Fix combined reductions with Kokkos::View #4896
core/unit_test/TestReduce.hpp
Outdated
@@ -635,7 +635,7 @@ TEST(TEST_CATEGORY, int_combined_reduce_mixed) {

   uint64_t nsum = (nw / 2) * (nw + 1);

-  auto result1_v = Kokkos::View<int64_t, Kokkos::HostSpace>{"result1_v"};
+  auto result1_v = Kokkos::View<int64_t>{"result1_v"};
Why did you change the test? This looks intentional.
It is intentional.
I changed it to demonstrate the issue I encountered in #4889, which was failing with an incompatible View assignment error (without changing anything in this test). Is there a reason to only allow HostSpace here?
Oh, I see. Can we change the test to take four reduction arguments, i.e. result1_v in the default memory space, result2_v in HostSpace, result3 as a scalar, and a fourth using the reducers (which wrap a host view)?
This will require some more changes, so I'll convert it to a draft again.
9ddebb1 to d7528f6
d7528f6 to 28cb6bd
It turned out that making this work for memory spaces that are not host-accessible is not trivial. The combined reducer implementation essentially forms a reducer that lives in host memory space. To construct the underlying values, the implementation so far tried to access the View via its call operator (in the host execution space) whenever a View was given as a reducer, which cannot work if the View is allocated in a memory space that is not host-accessible. In the end, using a View in a non-host-accessible memory space in combined reductions is more expensive than using a host-accessible View plus deep_copy. Hence, we could also decide to restrict combined reductions to host-accessible memory spaces. Note that the original fix, changing the memory space of the implicitly constructed view type, is still required for #4889.
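The write-back problem described in this comment can be sketched in plain C++17 without Kokkos. Everything below is a hypothetical model, not the code in this PR: `OpaqueSlot` stands in for a rank-0 View whose memory the host must not dereference, `copy_to_slot` and `copy_from_slot` stand in for a deep copy, and `write_back` is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a rank-0 View in a memory space that host code
// must not access through a call operator. Not a Kokkos type.
template <class T>
class OpaqueSlot {
  static_assert(std::is_trivially_copyable<T>::value,
                "the memcpy-based copy helpers need a trivially copyable T");

 public:
  OpaqueSlot() : m_storage(std::make_unique<T>()) {}

  // The only sanctioned ways to move data in or out (models a deep copy).
  friend void copy_to_slot(OpaqueSlot& dst, const T& src) {
    std::memcpy(dst.m_storage.get(), &src, sizeof(T));
  }
  friend T copy_from_slot(const OpaqueSlot& src) {
    T out;
    std::memcpy(&out, src.m_storage.get(), sizeof(T));
    return out;
  }

 private:
  std::unique_ptr<T> m_storage;
};

template <class T>
struct is_opaque_slot : std::false_type {};
template <class T>
struct is_opaque_slot<OpaqueSlot<T>> : std::true_type {};

// Write one reduced value, which lives on the host, back to its target.
template <class Target, class Value>
void write_back(Target& target, const Value& host_value) {
  if constexpr (is_opaque_slot<Target>::value) {
    copy_to_slot(target, host_value);  // explicit copy, never direct access
  } else {
    target = host_value;  // plain scalar reference: direct assignment
  }
}

// Small demonstration with one scalar target and one opaque target.
inline void example_mixed_write_back() {
  std::int64_t scalar = 0;
  OpaqueSlot<std::int64_t> slot;
  write_back(scalar, std::int64_t{5});
  write_back(slot, std::int64_t{7});
  assert(scalar == 5);
  assert(copy_from_slot(slot) == 7);
}
```

The dispatch on the target type is the point of the sketch: a scalar can be assigned directly, while an opaque target needs an explicit copy, which is where the extra cost mentioned above would come from.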
Retest this please.
Only building the
51ec1f6 to e3fdc3c
Retest this please.
e3fdc3c to f6a49d5
Looks good now. Can you think of a better name for write_one_value_back?
Did we ever discuss the semantics when mixing values and views? Obviously now, if one of the arguments is a view, then it is asynchronous.
@@ -266,13 +265,26 @@ struct CombinedReducerImpl<std::integer_sequence<size_t, Idxs...>, Space,
    return m_value_view;
  }

  KOKKOS_FUNCTION
  template <class ExecutionSpace, int Idx, class View>
  constexpr static void write_one_value_back(
constexpr is superfluous. There is no way it works in a constexpr context, neither the deep copy nor the assignment to a rank-0 view.
I only added it because the calling one was constexpr. Do you want me to drop it?
I'd say drop it.
And if the caller (I assume that is write_value_back_to_original_references(...)) is constexpr, that can't be correct now either, and should also have constexpr dropped.
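The constexpr point can be illustrated with standard C++ alone. This is a minimal sketch with hypothetical helper names (`add_one`, `copy_int`), not the functions under review; it only contrasts a function that can run in a constant expression with one that does a runtime-only copy.

```cpp
#include <cassert>
#include <cstring>

// Usable in a constant expression: pure arithmetic on its argument.
constexpr int add_one(int x) { return x + 1; }
static_assert(add_one(41) == 42, "evaluated at compile time");

// std::memcpy is not declared constexpr, so a function whose only job is
// such a copy gains nothing from a constexpr specifier. It is therefore
// declared as an ordinary function here.
inline void copy_int(int& dst, const int& src) {
  std::memcpy(&dst, &src, sizeof(int));
}

// Runtime use of both helpers.
inline void example_constexpr_vs_runtime() {
  int target = 0;
  const int source = add_one(2);
  copy_int(target, source);
  assert(target == 3);
}
```

A write-back routine that performs a deep copy falls into the second category, which is the argument for dropping the specifier.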
I thought it's an appropriate name given that it's called by
I would think that we as a whole didn't spend much time talking about the semantics of combined reductions at all. I would rather say that we need to fence whenever one of the arguments is a value.
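The rule suggested in this comment (fence whenever any result argument is a plain value) could be expressed as a compile-time check. The following is only a sketch under stated assumptions: `ViewLike`, `is_view_like`, and `must_fence_v` are hypothetical names invented for illustration, and nothing here reflects how Kokkos actually decides when to fence.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a view-like result handle. Not a Kokkos type.
template <class T>
struct ViewLike {
  T* data = nullptr;
};

template <class T>
struct is_view_like : std::false_type {};
template <class T>
struct is_view_like<ViewLike<T>> : std::true_type {};

// True if at least one result argument is NOT view-like, i.e. is a plain
// value whose contents the caller may read right after the call returns.
template <class... Results>
constexpr bool must_fence_v =
    (!is_view_like<std::decay_t<Results>>::value || ...);

static_assert(must_fence_v<ViewLike<int>, double>,
              "a scalar result forces a fence");
static_assert(!must_fence_v<ViewLike<int>, ViewLike<long>>,
              "all-view results may stay asynchronous");

// Runtime echo of the same checks.
inline void example_fence_rule() {
  assert((must_fence_v<double, ViewLike<int>&>));
  assert((!must_fence_v<ViewLike<int>>));
}
```

Under this rule, a call with only view-like results would be allowed to return before the reduction finishes, while any scalar result would make the whole call synchronous.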
124dbac to d80d065
Sharing pending change I had. Haven't reviewed the whole thing.
@nliber I dropped
Extracted from #4889. We should preserve the memory space in combined reductions if a View is given.