GPU optimizations for KOKKOS package #357

Merged
merged 6 commits into from Aug 24, 2022

stanmoore1 (Contributor) commented Aug 22, 2022

Purpose

Several optimizations for NVIDIA V100 GPUs, in particular reducing the cost of the global particle/reorder option. With cheaper reordering, particles can be reordered more often, which improves the performance of the move algorithm through better data locality.
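
To illustrate why reordering helps data locality, here is a minimal serial sketch that groups particles by their owning grid cell with a counting sort. It is not the code in this PR and not the Kokkos implementation; the `Particle` struct, its fields, and `reorder_by_cell` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record (assumption: a real particle has more fields).
struct Particle {
  int icell;  // index of the grid cell that owns this particle
  double x;   // position, one coordinate only for brevity
};

// Return a copy of `particles` grouped by cell index (stable counting sort).
// After reordering, particles in the same cell are contiguous in memory,
// so a loop over particles touches per-cell data with better locality.
std::vector<Particle> reorder_by_cell(const std::vector<Particle>& particles,
                                      int ncells) {
  // Pass 1: count the particles in each cell (stored shifted by one slot).
  std::vector<std::size_t> offset(static_cast<std::size_t>(ncells) + 1, 0);
  for (const Particle& p : particles) {
    assert(p.icell >= 0 && p.icell < ncells);
    ++offset[p.icell + 1];
  }
  // Prefix sum: offset[c] becomes the first output slot for cell c.
  for (int c = 0; c < ncells; ++c) offset[c + 1] += offset[c];

  // Pass 2: scatter each particle into the next free slot of its cell.
  std::vector<Particle> sorted(particles.size());
  for (const Particle& p : particles) sorted[offset[p.icell]++] = p;
  return sorted;
}
```

The sort makes two passes over the particle array, so its cost grows linearly with the particle count. That cost is what has to be weighed against the locality gain when choosing how often to reorder.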

For the in.collide benchmark with 10 million particles on a V100 GPU, I see a 1.45x overall speedup going from the master branch with particle/reorder 10 to this branch with particle/reorder 5.

This PR also uses better default KOKKOS package options for CPUs vs. GPUs, and changes comm classic to comm serial. The classic option is still accepted for backward compatibility but is deprecated.

Author(s)

Stan Moore (SNL), with helpful discussions/ideas from Evan Weinberg (NVIDIA), @weinbe2

Backward Compatibility

Yes

stanmoore1 added the enhancement (New feature or request) and KOKKOS package labels Aug 22, 2022
stanmoore1 self-assigned this Aug 22, 2022
stanmoore1 (Contributor, Author) commented

This also gives a small speedup on the CPU for Kokkos when using global particle/reorder.

stanmoore1 merged commit 9af7460 into sparta:master Aug 24, 2022
stanmoore1 deleted the kk_opt branch August 24, 2022 17:23
stanmoore1 added a commit that referenced this pull request Aug 24, 2022
stanmoore1 added a commit that referenced this pull request Aug 24, 2022
Fix issue from #357 and some other small tweaks