Avoid sharing handles for the same device between threads #1109

kmaehashi opened this issue Apr 6, 2018 · 2 comments



commented Apr 6, 2018

Currently, when user code is multi-threaded and each thread uses the same GPU device, all threads share the same handle for library calls. However, we call library functions inside with nogil blocks, so the GIL does not serialize those calls.

This means that there may be two issues:

  1. Library function invocations using the same handle may run concurrently.
  2. The stream set via a setStream call (prior to the library function invocation) may be overwritten by another thread's setStream call before the library function runs.
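Issue 2 can be illustrated without a GPU. The following is a minimal sketch, not CuPy code: `FakeHandle`, `set_stream`, and `gemm` are hypothetical stand-ins for a library handle, a setStream call, and a library function. Events force the problematic interleaving so the outcome is deterministic.

```python
import threading


class FakeHandle:
    """Hypothetical stand-in for a library handle; it only stores a stream."""

    def __init__(self):
        self.stream = None


def set_stream(handle, stream):
    # Stand-in for a setStream call on the handle.
    handle.stream = stream


def gemm(handle):
    # A real library call would enqueue work on handle.stream;
    # here we just report which stream would have been used.
    return handle.stream


shared = FakeHandle()          # one handle shared by both threads
a_set = threading.Event()
b_set = threading.Event()
results = {}


def thread_a():
    set_stream(shared, "stream_A")
    a_set.set()
    b_set.wait()               # thread B overwrites the stream in this window
    results["A"] = gemm(shared)


def thread_b():
    a_set.wait()
    set_stream(shared, "stream_B")
    b_set.set()
    results["B"] = gemm(shared)


threads = [threading.Thread(target=f) for f in (thread_a, thread_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

In this forced interleaving, thread A's call runs on the stream that thread B set, which is the overwrite described in issue 2.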

Below are quotes from the thread safety section of each library's API reference.


cuBLAS

The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will affect potentially subsequent CUBLAS calls in all threads. It is even more true for the destruction of the handle. So it is not recommended that multiple thread share the same CUBLAS handle.

Both issues 1 and 2 apply
(we use cublasSetMathMode, which changes the configuration of the handle).


cuDNN

For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.

The library is thread safe and its functions can be called from multiple host threads, as long as threads to do not share the same cuDNN handle simultaneously.

Both issues 1 and 2 apply.
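The one-handle-per-thread model recommended in the quote above could be sketched with thread-local storage. This is only an illustration under assumptions: `create_handle` and `get_handle` are hypothetical names, and `create_handle` returns a dummy object in place of a real library handle.

```python
import threading

_tls = threading.local()


def create_handle(device_id):
    # Hypothetical stand-in for a real handle constructor.
    return object()


def get_handle(device_id):
    """Return the calling thread's own handle for the given device."""
    handles = getattr(_tls, "handles", None)
    if handles is None:
        handles = _tls.handles = {}
    if device_id not in handles:
        handles[device_id] = create_handle(device_id)
    return handles[device_id]


seen = []


def worker():
    first = get_handle(0)
    second = get_handle(0)     # same thread, same device: handle is reused
    seen.append((first, second))


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread reuses its own handle for its whole lifetime, and no two threads ever see the same handle, so neither concurrent use nor stream overwrites can occur across threads.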

cuFFT (handle = plan)

cuFFT APIs are thread safe as long as different host threads execute FFTs using different plans and the output data are disjoint.

No issue in CuPy, as we can expect that Plan1d instances (Plan1d is our private API that creates a plan) are not shared between threads.


cuSOLVER

The library is thread safe and its functions can be called from multiple host threads.

Possibly both issues 1 and 2?


cuSPARSE

The library is thread safe and its functions can be called from multiple host threads.

No issue in CuPy, as our cuSPARSE invocations currently do not use with nogil.

cuRAND (handle = generator)

Unclear; the API reference does not describe thread safety.



commented Feb 25, 2019

I can confirm that both 1 and 2 apply to cuSOLVER as well, and that this is the cause of #2045. One way I see to solve this is to keep two dictionaries, one for handles in use and one for handles that are already allocated but free, assuming we don't want to create and release a handle for every operation. We may also want a more sophisticated handle manager, so that we don't accumulate a large number of handles that never get released.
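The two-dictionary idea above might look roughly like the following. This is a sketch under assumptions, not the approach taken in any actual fix: `HandlePool`, `borrow`, and `max_free` are hypothetical names, and the create/destroy callbacks are dummies standing in for real handle creation and destruction. The `max_free` cap is one simple way to avoid keeping idle handles forever.

```python
import contextlib
import threading


class HandlePool:
    """Hypothetical pool tracking in-use and free handles per device."""

    def __init__(self, create, destroy, max_free=4):
        self._create = create
        self._destroy = destroy
        self._max_free = max_free
        self._lock = threading.Lock()
        self._free = {}      # device_id -> list of idle handles
        self._in_use = {}    # id(handle) -> (device_id, handle)

    def acquire(self, device_id):
        with self._lock:
            free = self._free.setdefault(device_id, [])
            handle = free.pop() if free else self._create()
            self._in_use[id(handle)] = (device_id, handle)
            return handle

    def release(self, handle):
        with self._lock:
            device_id, _ = self._in_use.pop(id(handle))
            free = self._free.setdefault(device_id, [])
            if len(free) < self._max_free:
                free.append(handle)   # keep it for reuse
                return
        self._destroy(handle)         # over the cap: release it for real

    @contextlib.contextmanager
    def borrow(self, device_id):
        handle = self.acquire(device_id)
        try:
            yield handle
        finally:
            self.release(handle)


created, destroyed = [], []


def fake_create():
    handle = object()                 # dummy in place of a library handle
    created.append(handle)
    return handle


pool = HandlePool(fake_create, destroyed.append, max_free=1)

with pool.borrow(0) as h1:
    with pool.borrow(0) as h2:        # h1 is busy, so a second handle is made
        distinct = h1 is not h2

with pool.borrow(0) as h3:            # reuses the handle kept in the free list
    reused = h3 is h2
```

A handle is held by exactly one borrower at a time, so a stream set on it cannot be overwritten by another thread while it is borrowed.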

I can work on such a solution, but I would like some additional input first. Is there already an ongoing discussion elsewhere?

@mrocklin @anaruse FYI


Member Author

commented Feb 26, 2019

Thanks for the insights and reproduction code!
I started working on this issue in #2053.

@kmaehashi self-assigned this Feb 26, 2019

@kmaehashi added cat:bug and removed cat:enhancement labels Feb 26, 2019

@okuta closed this in #2053 Apr 1, 2019
