Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use primary context in CUDA GPU code. #3432

Merged
merged 3 commits into from Sep 9, 2022
Merged

Conversation

benmenadue
Copy link
Contributor

@benmenadue benmenadue commented Sep 5, 2022

Summary

This patch changes the CUDA context creation inside LAMMPS to instead use
the "primary" context for the requested device; this then makes LAMMPS
compatible with other libraries that also use CUDA via its runtime API.

It's not a complete fix, in that other libraries that use their own context
via the driver API could still break LAMMPS (e.g. if they switch to their
own context after LAMMPS has initialised its and don't switch it back before
returning), but it at least covers packages that use the CUDA runtime
library.

Unfortunately, to make all driver calls in LAMMPS "context aware" might
require more significant changes to the interfaces as the current context
may not be available at the calls (for example, the _device_alloc functions
in nvd_memory.h that don't take a UCL_Device argument).

Related Issue(s)

Fixes: #3431

Author(s)

Ben Menadue, National Computational Infrastructure, Australian National University

Licensing

By submitting this pull request, I agree, that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).

Backward Compatibility

These changes should not have any impact on backwards compatibility.

Implementation Notes

As recommended in the CUDA documentation, switches it to use cuDevicePrimaryCtxRetain and cuDevicePrimaryCtxRelease instead of cuCtxCreate and cuCtxDestroy.

Post Submission Checklist

  • The feature or features in this pull request is complete
  • Licensing information is complete
  • Corresponding author information is complete
  • The source code follows the LAMMPS formatting guidelines
  • Suitable new documentation files and/or updates to the existing docs are included
  • The added/updated documentation is integrated and tested with the documentation build system
  • The feature has been verified to work with the conventional build system
  • The feature has been verified to work with the CMake based build system
  • Suitable tests have been added to the unittest tree.
  • A package specific README file has been included or updated
  • One or more example input decks are included

Since LAMMPS uses the low-level driver API of CUDA, it needs to ensure
that it is in the correct context when invoking such functions. At the
moment it creates and switches to its own context inside `UCL_Device::set`
but then assumes that the driver is still in that context for subsequent
calls into CUDA; if another part of the program uses a different context
(such as the CUDA runtime using the "primary" context) this will cause
failures inside LAMMPS.

This patch changes the context creation to instead use the primary
context for the requested device. While it's not perfect, in that it
still doesn't ensure that it's in the correct context before making
driver API calls, it at least allows it to work with libraries that use
the runtime API.
@akohlmey akohlmey self-assigned this Sep 6, 2022
@akohlmey akohlmey added gpu_package gpu_unit_tests Enable to trigger GPU unit tests labels Sep 6, 2022
@akohlmey akohlmey added this to the Stable Release Spring 2023 milestone Sep 6, 2022
@akohlmey akohlmey removed the gpu_unit_tests Enable to trigger GPU unit tests label Sep 7, 2022
Copy link
Collaborator

@wmbrownIntel wmbrownIntel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benmenadue I think it is OK to make this the default behavior, but please add preprocessor option for original behavior (e.g. #ifndef GERYON_UNIQUE_CONTEXT). Thanks!

@akohlmey akohlmey added the gpu_unit_tests Enable to trigger GPU unit tests label Sep 9, 2022
@akohlmey
Copy link
Member

akohlmey commented Sep 9, 2022

@benmenadue I think it is OK to make this the default behavior, but please add preprocessor option for original behavior (e.g. #ifndef GERYON_UNIQUE_CONTEXT). Thanks!

@wmbrownIntel implemented your request

@akohlmey akohlmey removed the gpu_unit_tests Enable to trigger GPU unit tests label Sep 9, 2022
@akohlmey akohlmey self-requested a review September 9, 2022 19:59
@akohlmey akohlmey merged commit 1364033 into lammps:develop Sep 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] CUDA kernel launch failure with OpenMPI 4.1.4
3 participants