SYCL: Improve and simplify parallel_scan implementation #6064

masterleinad · 2023-04-18T17:13:28Z

This pull request replaces the previous recursive parallel_scan in the SYCL backend with a two-pass implementation, as we use for Cuda and HIP and for SYCL reductions. This simplifies the code (and makes it more uniform) and reduces the memory footprint, since we only need to store intermediate results for all items and group scans, not for recursive group scans.
Along the way, I made sure that all local operations use indices of type int, avoiding 64-bit index operations.
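
For readers unfamiliar with the two-pass structure, here is a minimal host-side sketch in plain C++. It is not the SYCL kernel from this PR: the function name `two_pass_inclusive_scan`, the `group_size` parameter, and the sequential loops standing in for work groups are all assumptions made for illustration. It only shows why one buffer of per-group totals is enough, with no recursion over group scans.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side illustration of a two-pass inclusive scan.
// Each "group" stands in for a work group; none of these names come from Kokkos.
std::vector<long> two_pass_inclusive_scan(const std::vector<long>& in,
                                          int group_size) {
  assert(group_size > 0);
  const std::size_t gs = static_cast<std::size_t>(group_size);
  const std::size_t n = in.size();
  const std::size_t n_groups = (n + gs - 1) / gs;

  std::vector<long> out(n);
  std::vector<long> group_totals(n_groups, 0);

  // Pass 1: scan every group independently and record each group's total.
  for (std::size_t g = 0; g < n_groups; ++g) {
    const std::size_t begin = g * gs;
    long running = 0;
    // The local index fits in an int, mirroring the PR's preference.
    for (int local = 0; local < group_size; ++local) {
      const std::size_t i = begin + static_cast<std::size_t>(local);
      if (i >= n) break;
      running += in[i];
      out[i] = running;
    }
    group_totals[g] = running;
  }

  // Between the passes: turn the group totals into exclusive offsets.
  long offset = 0;
  for (std::size_t g = 0; g < n_groups; ++g) {
    const long total = group_totals[g];
    group_totals[g] = offset;
    offset += total;
  }

  // Pass 2: add each group's offset to its locally scanned values.
  for (std::size_t g = 0; g < n_groups; ++g) {
    const std::size_t begin = g * gs;
    for (int local = 0; local < group_size; ++local) {
      const std::size_t i = begin + static_cast<std::size_t>(local);
      if (i >= n) break;
      out[i] += group_totals[g];
    }
  }
  return out;
}
```

The extra storage in this sketch is one value per item plus one value per group, which matches the memory argument made above.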

A second improvement is switching to auto-detection of the work group size, as we do for reductions, by querying a dummy kernel for its maximum group size.

Finally, this fixes a couple of unit tests that were failing with SYCL+Cuda since #5707.

algorithms/src/std_algorithms/impl/Kokkos_ExclusiveScan.hpp

masterleinad · 2023-04-21T14:26:17Z

Requires #6065.

core/src/SYCL/Kokkos_SYCL_Instance.cpp

crtrott

As far as I can tell, this looks good.

crtrott · 2023-05-04T23:41:40Z

core/src/SYCL/Kokkos_SYCL_Parallel_Scan.hpp

      (global_range + max_subgroup_size - 1) / max_subgroup_size;

-  const auto local_range = sg.get_local_range()[0];
+  const int local_range = sg.get_local_range()[0];


Did SYCL change so that you need int now?

No, it's not necessary to do that, but I have seen that SYCL is very sensitive to 64-bit index calculations, and local indices will surely never require that.

Just to be clear, auto deduces size_t, not int. I agree we don't need that much of an index range.
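
The deduction point can be checked at compile time. The sketch below uses a hypothetical `FakeRange` type as a stand-in for the range object returned by `sg.get_local_range()`, assuming only that its `operator[]` returns `std::size_t`; the helper name `narrowed_local_range` is also made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a SYCL range; only the return type of
// operator[] matters for this illustration.
struct FakeRange {
  std::size_t extent;
  std::size_t operator[](int) const { return extent; }
};

inline int narrowed_local_range(const FakeRange& r) {
  // With auto, the variable takes the type of the initializer: std::size_t.
  const auto deduced = r[0];
  static_assert(std::is_same<decltype(deduced), const std::size_t>::value,
                "auto deduces std::size_t here, not int");
  (void)deduced;

  // Spelling out int narrows explicitly, keeping local index math in 32 bits.
  const int local_range = static_cast<int>(r[0]);
  static_assert(std::is_same<decltype(local_range), const int>::value,
                "explicit int stays int");
  return local_range;
}
```

The narrowing is only safe because local ranges are small; the cast would truncate values that do not fit in an int.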

dalg24

I do not like the complexity added with the "lambda factory", but I will merge since others seem to think it is fine.

dalg24 · 2023-05-09T16:29:45Z

The SYCL build passed. Ignoring the rest.

masterleinad force-pushed the sycl_improve_parallel_scan_new branch from e3c9427 to 6cc246a Compare April 18, 2023 17:18

masterleinad commented Apr 18, 2023

algorithms/src/std_algorithms/impl/Kokkos_ExclusiveScan.hpp (outdated, resolved)

masterleinad mentioned this pull request Apr 18, 2023

Fix join for ValueWrapperForNoNeutralElement #6065

Merged

masterleinad force-pushed the sycl_improve_parallel_scan_new branch from 3354928 to e07be73 Compare April 20, 2023 21:03

masterleinad changed the title ~~[WIP] SYCL: Improve and simplify parallel_scan implementation~~ SYCL: Improve and simplify parallel_scan implementation Apr 20, 2023

masterleinad marked this pull request as ready for review April 20, 2023 21:16

masterleinad mentioned this pull request Apr 24, 2023

SYCL: Use in-order queue for SYCL+Cuda #6074

Merged

dalg24 reviewed Apr 25, 2023

core/src/SYCL/Kokkos_SYCL_Instance.cpp (resolved)

masterleinad added 2 commits April 28, 2023 14:16

Improve SYCL parallel_scan

3cc9915

Compiling with auto deduction of workgroup sizes

bdaa12c

masterleinad force-pushed the sycl_improve_parallel_scan_new branch from 2144d9b to bdaa12c Compare April 28, 2023 18:16

crtrott approved these changes May 4, 2023

masterleinad requested a review from nliber May 5, 2023 21:37

nliber approved these changes May 8, 2023

dalg24 reviewed May 9, 2023

dalg24 merged commit 6ede773 into kokkos:develop May 9, 2023
25 of 26 checks passed

masterleinad mentioned this pull request May 25, 2023

CHANGELOG: 4.1.0 #5902

Closed

masterleinad commented Apr 18, 2023 •

edited

masterleinad commented Apr 21, 2023

crtrott left a comment

crtrott May 4, 2023

masterleinad May 5, 2023

nliber May 8, 2023

dalg24 left a comment

dalg24 commented May 9, 2023
