Currently, when the user code is multi-threaded AND each thread uses the same GPU device, the same handle is shared between the threads for library calls. However, we are currently calling library functions using with nogil.
https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/cuda/device.pyx#L22-L25
https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/random/generator.py#L552-L583
This means that there may be two issues:
1. Library function invocations using the same handle may run concurrently.
2. A stream set via a setStream call (prior to the library function invocation) may be overwritten by a subsequent call (see the sketch below).
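To make issue 2 concrete, here is a minimal, purely illustrative sketch of the interleaving. FakeHandle and gemm below are hypothetical stand-ins for the shared per-device handle and a setStream + library call pair; this is not CuPy code. The sleep simulates the window opened by with nogil, during which another Python thread can reconfigure the shared handle.

```python
# Illustration only: models issue 2 (stream overwrite) with a fake handle.
import threading
import time

class FakeHandle:
    """Stands in for the per-device library handle shared by all threads."""
    def __init__(self):
        self.stream = None

handle = FakeHandle()  # one handle per device, shared across threads

def gemm(thread_name, my_stream):
    # Step 1: configure the shared handle (the setStream equivalent).
    handle.stream = my_stream
    # Step 2: the real library call runs under `with nogil`, so other Python
    # threads can run in the meantime; simulate that window with a sleep.
    time.sleep(0.01)
    # Step 3: work is enqueued on whatever stream the handle holds *now*.
    print(f"{thread_name}: wanted {my_stream}, ran on {handle.stream}")

t1 = threading.Thread(target=gemm, args=("thread-1", "stream-A"))
t2 = threading.Thread(target=gemm, args=("thread-2", "stream-B"))
t1.start(); t2.start(); t1.join(); t2.join()
# Typical output shows one thread running on the other thread's stream.
```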
Here is a quote of the Thread Safety section from the API reference of each library.
cuBLAS
The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will affect potentially subsequent CUBLAS calls in all threads. It is even more true for the destruction of the handle. So it is not recommended that multiple threads share the same CUBLAS handle.
Both issues 1 and 2 apply.
(We're using cublasSetMathMode, which changes the configuration for the handle.)
cuDNN
For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.
The library is thread safe and its functions can be called from multiple host threads, as long as threads do not share the same cuDNN handle simultaneously.
Both issues 1 and 2 apply (a small sketch of the recommended per-thread handle model follows).
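For comparison, the per-thread-handle model recommended by the cuDNN docs is easy to express with threading.local(). The sketch below only illustrates that pattern; create_handle is a placeholder for the actual cudnnCreate/cublasCreate call, not an existing CuPy API.

```python
# Sketch of a per-thread handle cache; each thread owns its handle for life.
import threading

_tls = threading.local()

def get_handle(create_handle):
    # Lazily create and cache one handle per thread, so handles are never
    # shared and configuration set by one thread cannot leak into another
    # thread's library calls.
    handle = getattr(_tls, 'handle', None)
    if handle is None:
        handle = create_handle()
        _tls.handle = handle
    return handle
```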
cuFFT (handle = plan)
cuFFT APIs are thread safe as long as different host threads execute FFTs using different plans and the output data are disjoint.
No issue in CuPy, as we can expect that Plan1d instances (our private API that creates a plan) are not shared between threads.
cuSOLVER
The library is thread safe and its functions can be called from multiple host threads.
Unclear.
Maybe both issues 1 and 2 apply?
cuSPARSE
The library is thread safe and its functions can be called from multiple host threads.
Unclear.
However, no issue in CuPy for now, as our cuSPARSE invocations currently do not use with nogil.
cuRAND (handle = generator)
Unclear; there is no description in the API reference.
I can confirm both 1 and 2 apply for cuSOLVER as well, and it is the issue in #2045. One way I see of solving this is to keep two dictionaries, one for handles in use and the other for handles that are already allocated but free (sketched below), assuming we don't want to create/release new handles for every operation. Perhaps we would want a more sophisticated handle manager, to avoid keeping countless handles that never get released.
I can work on such a solution, but I would like some additional input; perhaps there is already some other ongoing discussion?
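For reference, a rough sketch of that two-dictionary idea (in-use vs. free handles, keyed by device id) could look like the following. HandlePool, create_handle, and destroy_handle are hypothetical names standing in for the actual create/destroy calls of each library; this is only a starting point, not an actual CuPy implementation.

```python
# Sketch of a handle pool with separate in-use and free dictionaries.
import threading
from collections import defaultdict

class HandlePool:
    def __init__(self, create_handle, destroy_handle):
        self._create = create_handle    # e.g. wraps cusolverDnCreate
        self._destroy = destroy_handle  # e.g. wraps cusolverDnDestroy
        self._lock = threading.Lock()
        self._free = defaultdict(list)   # device id -> [idle handles]
        self._in_use = defaultdict(set)  # device id -> {checked-out handles}

    def acquire(self, device_id):
        """Check out a handle for one operation on the given device."""
        with self._lock:
            if self._free[device_id]:
                handle = self._free[device_id].pop()
            else:
                handle = self._create()
            self._in_use[device_id].add(handle)
            return handle

    def release(self, device_id, handle):
        """Return a handle so other threads/operations can reuse it."""
        with self._lock:
            self._in_use[device_id].discard(handle)
            self._free[device_id].append(handle)

    def trim(self, device_id, keep=1):
        """Destroy surplus idle handles so the pool does not grow unbounded."""
        with self._lock:
            while len(self._free[device_id]) > keep:
                self._destroy(self._free[device_id].pop())
```

Because a handle is checked out for the duration of a single call, no two concurrent calls share a handle, which addresses issue 1; issue 2 disappears as well since setStream is only ever applied to a handle the current operation owns. The trim method addresses the concern about keeping countless handles alive.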