Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Swap x- and z-indices in SYCL MDRangePolicy #4161

Merged
merged 1 commit into from
Jul 20, 2021

Conversation

masterleinad
Copy link
Contributor

Compared to CUDA, SYCL reverses the order of blocks and threads when mapping to the hardware, see https://www.codeplay.com/portal/blogs/2019/11/18/computecpp-v1-1-6-changes-to-work-item-mapping-optimization.html. We need to take this into account for multi-dimensional sycl::nd_range ranges. This pull request deals with the MDRangePolicy part.

@masterleinad masterleinad requested a review from nliber July 14, 2021 16:14
@masterleinad masterleinad added this to Awaiting Feedback in Developer: Daniel Arndt Jul 14, 2021
@DavidPoliakoff
Copy link
Contributor

So, we've talked in Slack about the wisdom of this X-Z swap. Do we have confidence that all the implementations we'll care about will have good performance from this? Do we think this swap is something they'll be consistent on, or something they'll change in time?

@crtrott crtrott added this to In progress in Kokkos Release 3.5 via automation Jul 14, 2021
@crtrott crtrott moved this from In progress to Awaiting Feedback in Kokkos Release 3.5 Jul 14, 2021
@crtrott crtrott self-requested a review July 14, 2021 18:59
@joeatodd
Copy link
Contributor

I've tested this locally and it does the trick for the MDRange. Alongside my local changes to Kokkos_SYCL_Parallel_Team.hpp and Kokkos_SYCL_Team.hpp, it fixes the crashes we've seen in large LAMMPS runs. It's also a lot tidier than the implementation I wrote!

I suspect that, so long as other applications are using the default LayoutLeft, this should improve performance. @masterleinad I think you mentioned that you'd seen improved performance in your tests, right?

As to the permanence of this change, my understanding is that this is how the SYCL spec was always defined, and the change in 2019 simply brings the implementations in line with the standard. As such, I am quite confident that SYCL will remain row-major.

@masterleinad
Copy link
Contributor Author

I suspect that, so long as other applications are using the default LayoutLeft, this should improve performance. @masterleinad I think you mentioned that you'd seen improved performance in your tests, right?

I checked that the performance results generated by the tests in https://github.com/kokkos/kokkos/tree/master/core/perf_test are more in line with the CUDA results, i.e. copying LayoutRight to LayoutLeft is faster than copying LayoutLeft to LayoutRight. For these tests, I didn't really see a general performance benefit, though.

@dalg24 dalg24 merged commit ba0f4bb into kokkos:develop Jul 20, 2021
Kokkos Release 3.5 automation moved this from Awaiting Feedback to Done Jul 20, 2021
@masterleinad masterleinad moved this from Awaiting Feedback to Done in Developer: Daniel Arndt Jul 21, 2021
@masterleinad masterleinad deleted the swap_md_range_policy branch December 2, 2021 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants