Fix guards for using scratch space with SYCL #6003

Merged

masterleinad
Contributor

This fixes team_scratch_1_queues for SYCL+Cuda by properly guarding scratch space for TeamPolicy. The changes here mostly mirror what we do for Cuda and HIP.
Previously, we only had a single mutex, m_team_scratch_mutex, that was locked in the constructor of a TeamPolicy ParallelFor or ParallelReduce and unlocked in the respective destructor. As the test failure demonstrates, this is not sufficient, since subsequent calls could resize, and thereby invalidate, the level 1 team scratch allocation. In Cuda and HIP, this problem is solved by having a pool of scratch allocations that we cycle through until an unused one is found. Similar to what we were doing previously, a scratch allocation is acquired in the constructor and marked as unused in the destructor. To make the required synchronization more obvious, I opted to store a sycl::event for each item in the pool and to wait for that event before acquiring the respective team allocation.
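
For readers unfamiliar with the pattern, here is a minimal host-only sketch of a scratch pool guarded by one event per slot. It is not the code from this PR: `ScratchPool`, `acquire`, `record`, `capacity`, and the pool size of 4 are hypothetical names and values chosen for illustration, the slot is picked round-robin as a simplifying assumption, and `Event` is a plain callable standing in for `sycl::event` (calling it plays the role of `sycl::event::wait()`). Locking against concurrent callers is omitted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for sycl::event (assumption): calling the object blocks until the
// kernel that last used the slot has finished. Empty means "nothing pending".
using Event = std::function<void()>;

// Hypothetical pool of level 1 team scratch allocations.
class ScratchPool {
 public:
  static constexpr std::size_t pool_size = 4;  // illustrative value

  // Take the next slot round-robin, wait for its previous user, and only
  // then grow the allocation, so a resize never invalidates memory that a
  // still-running kernel is using.
  std::size_t acquire(std::size_t bytes) {
    const std::size_t slot = m_next;
    m_next = (m_next + 1) % pool_size;
    Slot& s = m_slots[slot];
    if (s.last_use) {
      s.last_use();          // wait for the kernel that used this slot last
      s.last_use = nullptr;  // nothing pending any more
    }
    if (s.storage.size() < bytes) s.storage.resize(bytes);
    return slot;
  }

  // Remember the event of the kernel that is now using this slot.
  void record(std::size_t slot, Event e) {
    m_slots[slot].last_use = std::move(e);
  }

  char* data(std::size_t slot) { return m_slots[slot].storage.data(); }
  std::size_t capacity(std::size_t slot) const {
    return m_slots[slot].storage.size();
  }

 private:
  struct Slot {
    std::vector<char> storage;  // stands in for a device allocation
    Event last_use;             // empty: no kernel pending on this slot
  };
  std::array<Slot, pool_size> m_slots;
  std::size_t m_next = 0;
};
```

In this sketch a parallel construct would call `acquire` when it is set up and `record` right after submitting its kernel. A later launch that needs more scratch gets a different slot, and a slot is only resized after the event recorded for it has been waited on.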

@masterleinad
Contributor Author

Retest this please.

@masterleinad masterleinad marked this pull request as ready for review March 22, 2023 13:08
Member

@dalg24 dalg24 left a comment

No changes needed to the TeamPolicy parallel_scan?

core/src/SYCL/Kokkos_SYCL_Instance.hpp (outdated, resolved)
core/src/SYCL/Kokkos_SYCL_Instance.cpp (outdated, resolved)
core/src/SYCL/Kokkos_SYCL_Parallel_Team.hpp (resolved)
@dalg24
Member

dalg24 commented Mar 28, 2023

I plan on ignoring the Clang+CUDA build failure

@dalg24 dalg24 merged commit 8270db3 into kokkos:develop Mar 28, 2023
@masterleinad masterleinad mentioned this pull request May 25, 2023
3 participants