Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SYCL compiler and remove printf guarding with SYCL #614

Merged
merged 4 commits into from
Jan 24, 2022

Conversation

aprokop
Copy link
Contributor

@aprokop aprokop commented Jan 6, 2022

Fix #613.

@aprokop aprokop added the testing Anything to do with tests and CI label Jan 6, 2022
@aprokop aprokop changed the title Update SYCL compiler and remove printf workarounds Update SYCL compiler and remove printf guarding with SYCL Jan 6, 2022
@masterleinad
Copy link
Collaborator

You need to build Kokkos without deprecated code with this compiler version. It's a compiler bug.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

You need to build Kokkos without deprecated code with this compiler version. It's a compiler bug.

What is the option to disable deprecated code? -D Kokkos_ENABLE_DEPRECATED_CODE_3=OFF?

@masterleinad
Copy link
Collaborator

What is the option to disable deprecated code? -D Kokkos_ENABLE_DEPRECATED_CODE_3=OFF?

Yes, the configuration we were using in the Kokkos CI is
https://github.com/kokkos/kokkos/blob/fea35745dfad72decfc4516c4d026d4a0899af4e/.jenkins#L69-L83

@masterleinad
Copy link
Collaborator

Note that we will still get deprecation warnings unless we update to the latest Kokkos release.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

Kokkos 3.4 fails with

/scratch/kokkos/core/src/impl/Kokkos_Timer.hpp:58:15: error: no member named 'Timer' in namespace 'Kokkos'
using Kokkos::Timer;
      ~~~~~~~~^

@masterleinad
Copy link
Collaborator

Kokkos 3.4 fails with

That should not happen with deprecated code disabled...

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

That should not happen with deprecated code disabled...

The deprecated code protection in kokkos/core/src/impl/Kokkos_Timer.hpp only appeared in Kokkos 3.5. In 3.4, it is an unprotected using Kokkos::Timer;.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

With Kokkos 3.5, the container builds. However, getting "unnamed lambda" errors from ArborX SYCL build:

In file included from /var/jenkins/workspace/ArborX_PR-614/build/test/tstQueryTreeDegenerate_BF.cpp:1:
In file included from /var/jenkins/workspace/ArborX_PR-614/src/ArborX_BruteForce.hpp:15:
In file included from /var/jenkins/workspace/ArborX_PR-614/src/details/ArborX_AccessTraits.hpp:15:
In file included from /var/jenkins/workspace/ArborX_PR-614/src/details/ArborX_Box.hpp:15:
In file included from /var/jenkins/workspace/ArborX_PR-614/src/details/ArborX_DetailsKokkosExtArithmeticTraits.hpp:15:
In file included from /opt/kokkos/include/Kokkos_Macros.hpp:110:
In file included from /opt/kokkos/include/KokkosCore_Config_SetupBackend.hpp:47:
In file included from /opt/kokkos/include/setup/Kokkos_Setup_SYCL.hpp:48:
In file included from /opt/sycl/bin/../include/sycl/CL/sycl.hpp:15:
In file included from /opt/sycl/bin/../include/sycl/CL/sycl/backend.hpp:26:
In file included from /opt/sycl/bin/../include/sycl/CL/sycl/queue.hpp:19:
/opt/sycl/bin/../include/sycl/CL/sycl/handler.hpp:1029:52: error: unnamed lambda 'sycl::detail::__pf_kernel_wrapper<oneapi::dpl::__par_backend_hetero::__parallel_sort_copy_back_kernel<oneapi::dpl::execution::DefaultKernelName, oneapi::dpl::__ranges::zip_view<oneapi::dpl::__ranges::guard_view<unsigned int *>, oneapi::dpl::__ranges::guard_view<unsigned int *>>, oneapi::dpl::__par_backend_hetero::__full_merge_kernel, (lambda at /var/jenkins/workspace/ArborX_PR-614/src/details/ArborX_DetailsSortUtils.hpp:180:7)>>' used
      kernel_parallel_for<KernelName, ElementType>(KernelFunc);

This is coming from this code:

oneapi::dpl::sort(
policy, zipped_begin, zipped_begin + n,
[](auto lhs, auto rhs) { return std::get<0>(lhs) < std::get<0>(rhs); });

There is an fsycl-unnamed-lambda option, but it's ON by default, so unnamed lambdas should be allowed. Will try to enable them explicitly.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

Explicitly enabling -fsycl-unnamed-lambda did not help. Likely, because the message is confusing.

Trying a patch where lambda is not anonymous.

@aprokop aprokop force-pushed the sycl_update branch 2 times, most recently from e0200b2 to a07bf60 Compare January 6, 2022 21:04
oneapi::dpl::sort(
policy, zipped_begin, zipped_begin + n,
[](auto lhs, auto rhs) { return std::get<0>(lhs) < std::get<0>(rhs); });
oneapi::dpl::sort(policy, zipped_begin, zipped_begin + n, SYCLCompare{});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds more like a bug with the toolchain. We do have -fsycl-unnamed-lambda in the compile line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. I'm not sure sure what to think of intel/llvm#4769. Lets see if it fixes it. If not, I'll try to put the latest nightly and check.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 6, 2022

Cannot build Kokkos 3.5 with the latest SYCL nightly (20220106):

/scratch/kokkos/core/src/desul/atomics/SYCLConversions.hpp:48:20: error: no member named 'ONEAPI' in namespace 'sycl'
  static constexpr DESUL_SYCL_NAMESPACE::memory_scope value = DESUL_SYCL_NAMESPACE::memory_scope::device;
                   ^~~~~~~~~~~~~~~~~~~~
/scratch/kokkos/core/src/desul/atomics/Macros.hpp:38:36: note: expanded from macro 'DESUL_SYCL_NAMESPACE'
#define DESUL_SYCL_NAMESPACE sycl::ONEAPI
                             ~~~~~~^

@aprokop aprokop marked this pull request as draft January 7, 2022 00:26
@aprokop
Copy link
Contributor Author

aprokop commented Jan 7, 2022

This seems too brittle to continue trying to figure out. We could probably work around with a custom comparator, but that's simply another workaround. I think we should not spend any more time on this, and retry it in 6 months or a year. Maybe SYCL compiler would be more stable by then.

@masterleinad
Copy link
Collaborator

To me, it looks like we are almost there. I just don't have any idea what all this linking with omp is about. In the end, it makes sense to me to update after the next Kokkos release.

@masterleinad
Copy link
Collaborator

Let's see if setting the host implementation for oneDPL to "serial" helps, see https://github.com/oneapi-src/oneDPL/blob/main/cmake/README.md#using-onedpl-package.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 10, 2022

Let's see if setting the host implementation for oneDPL to "serial" helps, see oneapi-src/oneDPL@main/cmake/README.md#using-onedpl-package.

But even if it helps, I assume we need a parallel backend (which backend does it actually use on the Intel GPU, "openmp"?).

@masterleinad
Copy link
Collaborator

But even if it helps, I assume we need a parallel backend (which backend does it actually use on the Intel GPU, "openmp"?).

We are explicitly requesting dpcpp_only at compile time. My understanding is that ONEDPL_PAR_BACKEND really only affects host execution which we don't use anyway, also see https://github.com/oneapi-src/oneDPL/blob/394ae95fd935ac3532b129d920135433f42bc9eb/include/oneapi/dpl/pstl/parallel_backend.h.

It seems that we are now down to some performance regressions.

@aprokop
Copy link
Contributor Author

aprokop commented Jan 10, 2022

It seems that we are now down to some performance regressions.

What do you mean?

@masterleinad
Copy link
Collaborator

What do you mean?

This pull request now compiles and runs but ArborX_QueryTree_Test and ArborX_BruteForce_Example are timing out.

@masterleinad
Copy link
Collaborator

We were running into a deadlock with the fences in the SYCL ThreadVectorRange implementation for which there is no good solution, yet (with the current compiler version). The workaround is don't required on Intel GPUs interestingly.

@masterleinad
Copy link
Collaborator

For reference, we need intel/llvm#4904 or intel/llvm#4903 (including an actual implementation) to fix the current behavior in Kokkos.

Copy link
Contributor Author

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than two minor comments, I think this is fine.

-D CMAKE_CXX_FLAGS="-Wpedantic -Wall -Wextra -Wno-unknown-cuda-version -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice" \
-D CMAKE_CXX_FLAGS="-Wpedantic -Wall -Wextra -Wno-unknown-cuda-version" \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to remove these?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't need to remove these but there is no reason to keep it IMHO. These flags are(or should be) exported by Kokkos.

.jenkins/continuous.groovy Show resolved Hide resolved
@aprokop
Copy link
Contributor Author

aprokop commented Jan 24, 2022

@masterleinad OK, can you please squash the commits and restore other testing? Or, let me know and I'll do this myself.

@masterleinad
Copy link
Collaborator

@masterleinad OK, can you please squash the commits and restore other testing? Or, let me know and I'll do this myself.

Done.

@aprokop aprokop marked this pull request as ready for review January 24, 2022 19:06
@aprokop
Copy link
Contributor Author

aprokop commented Jan 24, 2022

retest this please

@aprokop
Copy link
Contributor Author

aprokop commented Jan 24, 2022

Huh, just noticed that SYCL build time dropped from 30 minutes to 11.

Copy link
Contributor

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if you add the comment about the Kokkos git commit hash used when building the Docker image

@dalg24 dalg24 merged commit 5513b63 into arborx:master Jan 24, 2022
@aprokop aprokop deleted the sycl_update branch January 24, 2022 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Anything to do with tests and CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Get rid of printf guarding with SYCL
3 participants