Avoid sharing handles for the same device between threads #1109

kmaehashi · 2018-04-06T11:06:16Z

Currently, when the user code is multi-threaded AND each thread uses the same GPU device, the same handle is shared between threads for library calls. However, we call library functions inside with nogil blocks, so the GIL does not serialize those calls.

https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/cuda/device.pyx#L22-L25
https://github.com/cupy/cupy/blob/v4.0.0rc1/cupy/random/generator.py#L552-L583

This means that there may be two issues:

1. Library function invocations using the same handle may run concurrently.
2. The stream set via a setStream call (prior to the library function invocation) may be overwritten by a subsequent setStream call from another thread.
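Issue 2 can be reproduced without a GPU. The following is a minimal sketch, not CuPy code: FakeHandle, set_stream and run are hypothetical stand-ins for a library handle, its setStream call and a library function. Events force the unlucky interleaving so the outcome is deterministic.

```python
import threading


class FakeHandle:
    """Hypothetical stand-in for a library handle that remembers one stream."""

    def __init__(self):
        self.stream = None

    def set_stream(self, stream):
        self.stream = stream

    def run(self):
        # A real library function would launch its work on self.stream.
        return self.stream


shared = FakeHandle()  # one handle shared by both threads
results = {}
a_has_set = threading.Event()
b_has_set = threading.Event()


def thread_a():
    shared.set_stream("stream_a")
    a_has_set.set()
    b_has_set.wait()  # thread B overwrites the stream before A's call
    results["a"] = shared.run()


def thread_b():
    a_has_set.wait()
    shared.set_stream("stream_b")
    b_has_set.set()
    results["b"] = shared.run()


threads = [threading.Thread(target=f) for f in (thread_a, thread_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Thread A asked for "stream_a" but its call ran on thread B's stream.
print(results["a"], results["b"])
```

In real code the interleaving is not forced, so the failure would show up only intermittently.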

Here are quotes from the thread-safety section of each library's API reference.

cuBLAS

The library is thread safe and its functions can be called from multiple host threads, even with the same handle. When multiple threads share the same handle, extreme care needs to be taken when the handle configuration is changed because that change will affect potentially subsequent CUBLAS calls in all threads. It is even more true for the destruction of the handle. So it is not recommended that multiple thread share the same CUBLAS handle.

Both issues 1 and 2 apply.
(We're using cublasSetMathMode, which changes the configuration of the handle.)

cuDNN

For multithreaded applications that use the same device from different threads, the recommended programming model is to create one (or a few, as is convenient) cuDNN handle(s) per thread and use that cuDNN handle for the entire life of the thread.

The library is thread safe and its functions can be called from multiple host threads, as long as threads to do not share the same cuDNN handle simultaneously.

Both issues 1 and 2 apply.
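The per-thread model recommended in the cuDNN quote could look roughly like this. It is a sketch under assumptions, not the CuPy implementation: create_handle is a hypothetical placeholder for a real creation call such as cudnnCreate, and get_handle is an illustrative name. Each thread lazily builds its own device-to-handle table in thread-local storage.

```python
import itertools
import threading

_counter = itertools.count()
_tls = threading.local()


def create_handle(device_id):
    # Hypothetical placeholder for a real creation call such as cudnnCreate.
    return ("handle", device_id, next(_counter))


def get_handle(device_id):
    """Return the calling thread's own handle for device_id, creating it once."""
    handles = getattr(_tls, "handles", None)
    if handles is None:
        handles = _tls.handles = {}
    if device_id not in handles:
        handles[device_id] = create_handle(device_id)
    return handles[device_id]


# Repeated calls in one thread reuse the handle; another thread gets its own.
main_handle = get_handle(0)
other = []
worker = threading.Thread(target=lambda: other.append(get_handle(0)))
worker.start()
worker.join()
print(get_handle(0) is main_handle, other[0] != main_handle)
```

A design point this sketch leaves open is cleanup: handles stored this way live as long as the thread's local storage, so a real implementation would need to destroy them when the thread exits.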

cuFFT (handle = plan)

cuFFT APIs are thread safe as long as different host threads execute FFTs using different plans and the output data are disjoint.

No issue in CuPy, as we can expect that Plan1d (our private API that creates a plan) instances are not shared between threads.

cuSOLVER

The library is thread safe and its functions can be called from multiple host threads.

Unclear.
Possibly both issues 1 and 2?

cuSPARSE

The library is thread safe and its functions can be called from multiple host threads.

Unclear.
No issue in CuPy, as our cuSPARSE invocation currently does not use with nogil.

cuRAND (handle = generator)

Unclear; the API reference has no description of thread safety.

pentschev · 2019-02-25T09:47:11Z

I can confirm that both 1 and 2 apply to cuSOLVER as well, and that this is the issue in #2045. One way I see of solving this is to keep two dictionaries, one for handles in use and the other for handles that are already allocated but free, assuming we don't want to create/release new handles for every operation. Perhaps we would want a more sophisticated handle manager, to avoid keeping countless handles that never get released.

I can work on such a solution, but I would like some additional input first. Is there perhaps already some other ongoing discussion?
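The two-dictionary idea could be sketched as follows. This is only an illustration of the proposal, not the approach taken in #2053: HandlePool, acquire, release and borrow are hypothetical names, and the create callback stands in for whatever call actually allocates a library handle.

```python
import threading
from contextlib import contextmanager


class HandlePool:
    """Hypothetical pool tracking free and in-use handles per device."""

    def __init__(self, create):
        self._create = create  # callback that allocates a new handle
        self._lock = threading.Lock()
        self._free = {}    # device_id -> list of idle handles
        self._in_use = {}  # device_id -> list of handles checked out

    def acquire(self, device_id):
        with self._lock:
            free = self._free.setdefault(device_id, [])
            # Reuse an idle handle if there is one, otherwise allocate.
            handle = free.pop() if free else self._create(device_id)
            self._in_use.setdefault(device_id, []).append(handle)
            return handle

    def release(self, device_id, handle):
        with self._lock:
            self._in_use[device_id].remove(handle)
            self._free[device_id].append(handle)

    @contextmanager
    def borrow(self, device_id):
        handle = self.acquire(device_id)
        try:
            yield handle
        finally:
            self.release(device_id, handle)


pool = HandlePool(create=lambda device_id: object())
with pool.borrow(0) as first:
    with pool.borrow(0) as second:
        print(first is second)  # concurrent borrowers never share a handle
```

The pool as written never destroys handles, which is the "countless handles" concern raised above; a cap on the free list or an idle timeout would be one way to address it.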

@mrocklin @anaruse FYI

kmaehashi · 2019-02-26T02:26:10Z

Thanks for the insights and reproduction code!
I started working on this issue in #2053.

kmaehashi added the cat:enhancement Improvements to existing features label Apr 6, 2018

not522 mentioned this issue Oct 8, 2018

Need in-place and planned C2C & Z2Z FFT for better performance #1669

Closed

pentschev mentioned this issue Feb 20, 2019

Multi-threaded qr() causes CUDA illegal memory access #2045

Closed

kmaehashi mentioned this issue Feb 26, 2019

Avoid sharing handles between threads #2053

Merged

kmaehashi self-assigned this Feb 26, 2019

kmaehashi added cat:bug Bugs and removed cat:enhancement Improvements to existing features labels Feb 26, 2019

okuta closed this as completed in #2053 Apr 1, 2019

kmaehashi mentioned this issue Dec 4, 2019

Support cuTENSOR 1.0.0 #2709

Merged

maleadt mentioned this issue Mar 10, 2020

Use a library handle for each thread/device. JuliaGPU/CuArrays.jl#623

Merged
