Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SYCL custom reductions required for ArborX #3544

Merged
merged 3 commits into from
Nov 12, 2020

Conversation

masterleinad
Copy link
Contributor

Based on #3490 and slightly on #3542.

Copy link
Contributor

@nliber nliber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay to me.

@masterleinad masterleinad force-pushed the sycl_custom_reductions_clean branch 2 times, most recently from 2ed9f3f to 3a6f355 Compare November 5, 2020 18:09
@masterleinad masterleinad marked this pull request as ready for review November 5, 2020 18:10
@dalg24 dalg24 requested a review from nliber November 5, 2020 18:23
Copy link
Contributor

@nliber nliber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@masterleinad
Copy link
Contributor Author

It seems that the detection mechanism for ReduceFunctorHas* is really brittle but the extension I need here works for clang so I only enable it for this backend. If someone has a better idea, I am happy to take that. Also, the relevant tests are enabled in 6aedc5c (#3544) but these tests were not working correctly on the GPU but only in the CI. Hence, 3a6f355 (#3544) reverts that commit.

if constexpr (ReduceFunctorHasJoin<Functor>::value) {
return cl::sycl::ONEAPI::reduction(
result_ptr, identity,
[=](value_type& old_value, const value_type& new_value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment why you are capturing by value here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/var/jenkins/workspace/Kokkos/core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp:170:19: error: 'const Kokkos::Impl::ParallelReduce<Test::TestViewOperator_LeftAndRight<int [2][3][4][2][3][4], Kokkos::Experimental::SYCL, 6>, Kokkos::RangePolicy<Kokkos::Experimental::SYCL>, Kokkos::InvalidType, Kokkos::Experimental::SYCL>::ExtendedReferenceWrapper<Test::TestViewOperator_LeftAndRight<int [2][3][4][2][3][4], Kokkos::Experimental::SYCL, 6>> &' cannot be used as the type of a kernel parameter
                  functor.join(old_value, new_value);
                  ^

core/src/SYCL/Kokkos_SYCL_Parallel_Reduce.hpp Show resolved Hide resolved
core/src/impl/Kokkos_FunctorAdapter.hpp Outdated Show resolved Hide resolved
@masterleinad masterleinad force-pushed the sycl_custom_reductions_clean branch 2 times, most recently from 9f67b70 to e21ba68 Compare November 10, 2020 13:38
@@ -83,7 +83,7 @@ SYCL::SYCL() : m_space_instance(&Impl::SYCLInternal::singleton()) {

int SYCL::concurrency() {
// FIXME_SYCL
return 1;
return 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a fixme but one of the test was failing because asserting that the concurrency is greater than one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a more detailed comment than "FIXME" here, something like "this is to work around some tests that assert concurrency must be greater than 1"

@crtrott
Copy link
Member

crtrott commented Nov 10, 2020

I find this change in AnalyzeFunctor highly suspect. On the other hand that whole thing is highly suspect and Daisy wants to rewrite it from scratch so ...


value_type host_result = identity;
if constexpr (ReduceFunctorHasInit<Functor>::value)
ValueInit::init(functor, &host_result);
q.memcpy(result_ptr, &host_result, sizeof(host_result));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just use assignment here *result_ptr = host_result;, mainly because the rest of the code also takes advantage of it being in USM shared memory (*m_result_ptr = *result_ptr;, for instance).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved in e297544 (#3544)

Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo history cleanup

@dalg24
Copy link
Member

dalg24 commented Nov 12, 2020

Plz cleanup history and I will merge

@masterleinad
Copy link
Contributor Author

@dalg24 Here you go!

@dalg24 dalg24 merged commit 526220f into kokkos:develop Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants