Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix combined reductions with Kokkos::View #4896

Merged
merged 10 commits into from
Jun 20, 2022

Conversation

masterleinad
Copy link
Contributor

Extracted from #4889. We should preserve the memory space in combined reductions if a View is given.

@@ -635,7 +635,7 @@ TEST(TEST_CATEGORY, int_combined_reduce_mixed) {

uint64_t nsum = (nw / 2) * (nw + 1);

auto result1_v = Kokkos::View<int64_t, Kokkos::HostSpace>{"result1_v"};
auto result1_v = Kokkos::View<int64_t>{"result1_v"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change the test. This looks intentional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is intentional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to demonstrate the issue that I encountered in #4889 that was failing with an incompatible View assignment error (without changing anything in this test). Is there are reason to only allow HostSpace here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Can we change the test to take four reduction args, i.e. result1_v with defaultspace, result2_v with HostSpace, result3 as a scalar, and a fourth using the reducers (which is wrapping a host view)?

@@ -635,7 +635,7 @@ TEST(TEST_CATEGORY, int_combined_reduce_mixed) {

uint64_t nsum = (nw / 2) * (nw + 1);

auto result1_v = Kokkos::View<int64_t, Kokkos::HostSpace>{"result1_v"};
auto result1_v = Kokkos::View<int64_t>{"result1_v"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is intentional.

@@ -635,7 +635,7 @@ TEST(TEST_CATEGORY, int_combined_reduce_mixed) {

uint64_t nsum = (nw / 2) * (nw + 1);

auto result1_v = Kokkos::View<int64_t, Kokkos::HostSpace>{"result1_v"};
auto result1_v = Kokkos::View<int64_t>{"result1_v"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Can we change the test to take four reduction args, i.e. result1_v with defaultspace, result2_v with HostSpace, result3 as a scalar, and a fourth using the reducers (which is wrapping a host view)?

@masterleinad
Copy link
Contributor Author

This will require some more changes so I'll convert it to a draft again.

@masterleinad masterleinad marked this pull request as draft March 24, 2022 14:34
@masterleinad masterleinad force-pushed the fix_combined_reductions branch 2 times, most recently from 9ddebb1 to d7528f6 Compare March 24, 2022 20:38
@masterleinad masterleinad marked this pull request as ready for review March 24, 2022 20:41
@masterleinad
Copy link
Contributor Author

It turned out that making this work for memory spaces that are not host-accessible is not that trivial. The combined reducer implementation basically forms a reducer that lives in host memory space. To construct the underlying values, the implementation so far tried to access the VIew via the call operator (in host execution space) if a View was given as reducer which obviously doesn't work if the View is allocated in non-host-accessible memory space.
Since the initial value isn't used anyway, I decided to default-construct one of the desired type instead of copying the one we have stored in the View. Since the value type in a View has to be default-constructible anyway (implicitly in the current implementation), that should work.
When the value stored in the combined reducer is copied back, we run into the same memory access problems and we just perform a Kokkos::deep_copy now if the memory space is not accessible from the host.

In the end, using View with non-host-accessible memory space in combined reductions is more expensive than a host-accessible View + deep_copy. Hence, we could also decide to restrict combined reductions to host-accessible memory spaces. Note that the original fix changing the memory space of the implicitly constructed view type is still required for #4889.

@masterleinad
Copy link
Contributor Author

Retest this please.

@masterleinad
Copy link
Contributor Author

Only building the OpenMPTarget-Clang image failed.

@masterleinad
Copy link
Contributor Author

Retest this please.

core/src/impl/Kokkos_Combined_Reducer.hpp Outdated Show resolved Hide resolved
core/src/impl/Kokkos_Combined_Reducer.hpp Outdated Show resolved Hide resolved
core/src/impl/Kokkos_Combined_Reducer.hpp Outdated Show resolved Hide resolved
@masterleinad masterleinad requested a review from crtrott June 2, 2022 22:13
Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now. Can you think of a better name for write_one_value_back?

Did we ever discuss the semantics when mixing value and view? Obviously now, if one of the argument is a view, then it is asynchronous.

@@ -266,13 +265,26 @@ struct CombinedReducerImpl<std::integer_sequence<size_t, Idxs...>, Space,
return m_value_view;
}

KOKKOS_FUNCTION
template <class ExecutionSpace, int Idx, class View>
constexpr static void write_one_value_back(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constexpr is superfluous. There is no way it works in a constexpr context, neither the deep copy nor the assign to rank-0 view

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added it because the calling one was constexpr. Do you want me to drop it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say drop it.

And if the caller (I assume that is write_value_back_to_original_references(...)) is constexpr, that can't be correct now either, and should also have constexpr dropped.

@masterleinad
Copy link
Contributor Author

Looks good now. Can you think of a better name for write_one_value_back?

I thought it's an appropriate name given that it's called by write_value_back_to_original_references().

Did we ever discuss the semantics when mixing value and view? Obviously now, if one of the argument is a view, then it is asynchronous.

I would think that we as a whole didn't spend much time talking about the semantics of combined reductions at all. I would rather say that we need to fence whenever one of the arguments is a value.

Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sharing pending change I had. Haven't reviewed the whole thing.

core/src/impl/Kokkos_Combined_Reducer.hpp Outdated Show resolved Hide resolved
nliber
nliber previously requested changes Jun 15, 2022
@@ -266,13 +265,26 @@ struct CombinedReducerImpl<std::integer_sequence<size_t, Idxs...>, Space,
return m_value_view;
}

KOKKOS_FUNCTION
template <class ExecutionSpace, int Idx, class View>
constexpr static void write_one_value_back(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say drop it.

And if the caller (I assume that is write_value_back_to_original_references(...)) is constexpr, that can't be correct now either, and should also have constexpr dropped.

@masterleinad masterleinad requested a review from nliber June 15, 2022 21:37
@masterleinad
Copy link
Contributor Author

@nliber I dropped constexpr for the two functions.

@masterleinad
Copy link
Contributor Author

@crtrott @nliber ping I addressed your comments.

@crtrott crtrott merged commit d3c1948 into kokkos:develop Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants