Optimize KOKKOS package for small systems #1422

Merged
akohlmey merged 44 commits into lammps:master from stanmoore1:team_opt on Jun 11, 2019

Conversation

@stanmoore1
Contributor

commented Apr 8, 2019

Summary

This PR adds several optimizations to the KOKKOS package for small systems running on GPUs:

  • an option to thread over neighbors in addition to threading over atoms
  • fused forward communication pack/unpack that reduces the number of CUDA kernel launches when running on a single MPI rank
  • many other small optimizations

Author(s)

Stan Moore (Sandia)

Licensing

By submitting this pull request, I agree that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).

Backward Compatibility

No issues.

stanmoore1 added some commits Apr 8, 2019

@stanmoore1

Contributor Author

commented Apr 9, 2019

I'm done coding; running regression and performance tests now.

@stanmoore1

Contributor Author

commented Apr 9, 2019

For 1000 Lennard-Jones atoms, I'm seeing a little over 2x speedup on a single V100 GPU from this PR.

stanmoore1 added some commits Apr 9, 2019

@stanmoore1

Contributor Author

commented May 16, 2019

Some regression tests are still failing with this PR; looking into it.

@akohlmey akohlmey added this to the Stable Release Summer 2019 milestone May 21, 2019

@stanmoore1 stanmoore1 force-pushed the stanmoore1:team_opt branch from e963b50 to 3b60686 May 29, 2019

@stanmoore1 stanmoore1 requested a review from akohlmey May 29, 2019

@stanmoore1 stanmoore1 assigned akohlmey and unassigned stanmoore1 May 29, 2019

@akohlmey akohlmey requested review from rbberger and athomps Jun 10, 2019

@akohlmey akohlmey merged commit e72ac92 into lammps:master Jun 11, 2019

6 checks passed

lammps/pull-requests/build-docs-pr head run ended
lammps/pull-requests/cmake/cmake-serial-pr head run ended
lammps/pull-requests/kokkos-omp-pr head run ended
lammps/pull-requests/openmpi-pr head run ended
lammps/pull-requests/serial-pr head run ended
lammps/pull-requests/shlib-pr head run ended