Avoid sharing handles for the same device between threads #1109

Closed
kmaehashi opened this issue Apr 6, 2018 · 2 comments

@kmaehashi
Member

kmaehashi commented Apr 6, 2018

Currently, when user code is multi-threaded and several threads use the same GPU device, those threads share a single handle for library calls. However, we currently call library functions inside with nogil blocks, so these calls are not serialized by the GIL.

https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/cuda/device.pyx#L22-L25
https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/random/generator.py#L552-L583

This means that there may be two issues:

  1. Library function invocations using the same handle may run concurrently.
  2. A stream set via a setStream call (prior to the library function invocation) may be overwritten by another thread's setStream call before the library function runs.
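
As a rough illustration of issue 2, here is a pure-Python simulation that involves no CUDA library at all. FakeHandle, set_stream, and library_call are hypothetical stand-ins for a real handle, its setStream function, and a library function; the events only force the problematic interleaving so that it shows up on every run.

```python
import threading


class FakeHandle:
    """Hypothetical stand-in for a library handle that stores its stream."""

    def __init__(self):
        self.stream = None


def set_stream(handle, stream):
    # Stand-in for a setStream call: mutates state shared via the handle.
    handle.stream = stream


def library_call(handle):
    # Stand-in for a library function: reports the stream it would run on.
    return handle.stream


def run_shared_handle_race():
    handle = FakeHandle()  # one handle shared by both threads
    a_set = threading.Event()
    b_set = threading.Event()
    observed = {}

    def thread_a():
        set_stream(handle, "stream_a")
        a_set.set()
        b_set.wait()  # thread B overwrites the stream during this window
        observed["a"] = library_call(handle)

    def thread_b():
        a_set.wait()
        set_stream(handle, "stream_b")
        b_set.set()
        observed["b"] = library_call(handle)

    threads = [threading.Thread(target=thread_a),
               threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


# Thread A asked for "stream_a" but its call sees "stream_b".
print(run_shared_handle_race())
```

With a real handle the interleaving is not forced, so the overwrite would only happen occasionally, which makes it hard to reproduce.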

Below are quotes from the thread safety section of each library's API reference, followed by notes on which of the issues apply.

cuBLAS

The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will affect potentially subsequent CUBLAS calls in all threads. It is even more true for the destruction of the handle. So it is not recommended that multiple thread share the same CUBLAS handle.

Both issues 1 and 2 apply.
(We use cublasSetMathMode, which changes the configuration of the handle.)

cuDNN

For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.

The library is thread safe and its functions can be called from multiple host threads, as long as threads to do not share the same cuDNN handle simultaneously.

Both issues 1 and 2 apply.
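
The one-handle-per-thread model recommended in the cuDNN quote could be sketched with thread-local storage, roughly as below. This is an assumption-laden sketch rather than CuPy's implementation: get_handle and _create_handle are hypothetical names, and _create_handle returns a placeholder tuple where a real version would call the library's handle creation function.

```python
import itertools
import threading

_counter = itertools.count()
_tls = threading.local()  # each thread sees its own attributes


def _create_handle(device_id):
    # Hypothetical: a real implementation would create a library handle
    # for the given device here.
    return ("handle", device_id, next(_counter))


def get_handle(device_id):
    """Return this thread's handle for device_id, creating it on first use."""
    handles = getattr(_tls, "handles", None)
    if handles is None:
        handles = _tls.handles = {}
    if device_id not in handles:
        handles[device_id] = _create_handle(device_id)
    return handles[device_id]
```

Within one thread, repeated calls to get_handle(0) return the same object, while another thread gets a distinct handle, so neither concurrent use of a handle nor stream overwrites can occur across threads. The sketch does not address destroying handles when a thread exits.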

cuFFT (handle = plan)

cuFFT APIs are thread safe as long as different host threads execute FFTs using different plans and the output data are disjoint.

No issue in CuPy, as we can expect that Plan1d instances (Plan1d is our private API that creates a plan) are not shared between threads.

cuSOLVER

The library is thread safe and its functions can be called from multiple host threads.

Unclear.
Possibly both issues 1 and 2?

cuSPARSE

The library is thread safe and its functions can be called from multiple host threads.

Unclear.
No issue in CuPy, as our cuSPARSE invocation currently does not use with nogil.

cuRAND (handle = generator)

Unclear; the API reference does not describe thread safety.

@pentschev
Member

pentschev commented Feb 25, 2019

I can confirm that both 1 and 2 apply to cuSOLVER as well; this is the issue in #2045. One way I see to solve this is to keep two dictionaries, one for handles in use and the other for handles that are already allocated but free, assuming we don't want to create and release a new handle for every operation. We may also want a more sophisticated handle manager, to avoid accumulating handles that never get released.
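
A minimal sketch of the two-dictionary idea, under several assumptions: HandlePool, borrow, and max_free are hypothetical names, the create and destroy callables are supplied by the caller, and handles are assumed to be hashable. It is not the design adopted in CuPy.

```python
import contextlib
import threading


class HandlePool:
    """Hypothetical per-device pool tracking free and in-use handles."""

    def __init__(self, create, destroy, max_free=4):
        self._create = create      # callable: device_id -> handle
        self._destroy = destroy    # callable: handle -> None
        self._max_free = max_free  # cap on idle handles kept per device
        self._lock = threading.Lock()
        self._free = {}    # device_id -> list of idle handles
        self._in_use = {}  # device_id -> set of borrowed handles

    @contextlib.contextmanager
    def borrow(self, device_id):
        with self._lock:
            free = self._free.setdefault(device_id, [])
            # Reuse an idle handle if one exists, otherwise create one.
            handle = free.pop() if free else self._create(device_id)
            self._in_use.setdefault(device_id, set()).add(handle)
        try:
            yield handle
        finally:
            with self._lock:
                self._in_use[device_id].discard(handle)
                free = self._free[device_id]
                if len(free) < self._max_free:
                    free.append(handle)
                    handle = None  # kept for reuse, nothing to destroy
            if handle is not None:
                self._destroy(handle)  # pool is full, release it
```

A caller would wrap each library invocation in `with pool.borrow(device_id) as handle:`, so a handle is never used by two threads at once, and the max_free cap bounds how many idle handles linger.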

I can work on such a solution, but I would like some additional input. Perhaps there is already some other ongoing discussion?

@mrocklin @anaruse FYI

@kmaehashi
Member Author

Thanks for the insights and reproduction code!
I started working on this issue in #2053.
