
Suggestion for the CUDA stream module #11606

Open
nglee opened this issue May 28, 2018 · 0 comments
Labels
category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib RFC


nglee commented May 28, 2018

Hi, I'd like to suggest something related to the CUDA stream module.

It seems that the cv::cuda::Stream class encapsulates a memory-allocation feature built on the StackAllocator and MemoryPool classes.

This feature is described in detail in the documentation for BufferPool. In short, it appears to be designed to bypass CUDA memory allocation API calls and thereby improve performance, as follows:

  • When a Stream class is constructed, some amount of CUDA memory is pre-allocated and assigned to the Stream instance.
  • If an OpenCV algorithm is run on that stream, the algorithm internally uses the BufferPool class to obtain buffer memory (if needed) from the pre-allocated area, rather than calling the CUDA memory allocation API.
  • This reduces overhead.
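The idea behind the per-stream pool can be modeled with a minimal bump/stack allocator in plain C++. This is a simplified sketch of the concept only; SimpleStackAllocator is a hypothetical name, not OpenCV's actual StackAllocator implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a per-stream stack allocator: one large block is
// reserved up front, and each "allocation" just bumps an offset, so no
// further allocation API calls are made while the stream is in use.
class SimpleStackAllocator {
public:
    explicit SimpleStackAllocator(std::size_t capacity)
        : pool_(capacity), top_(0) {}

    // Returns a pointer into the pre-allocated pool, or nullptr if exhausted.
    void* allocate(std::size_t bytes) {
        if (top_ + bytes > pool_.size()) return nullptr;
        void* p = pool_.data() + top_;
        top_ += bytes;
        return p;
    }

    // Stack discipline: only the most recent allocation may be released.
    void deallocate(std::size_t bytes) { top_ -= bytes; }

    std::size_t used() const { return top_; }

private:
    std::vector<unsigned char> pool_;
    std::size_t top_;
};
```

Because an allocation is just an offset bump, requesting even a few bytes of scratch memory costs almost nothing compared to a real CUDA allocation call.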

Since there are many cases where CUDA algorithms need some amount of GPU memory (sometimes as little as a few bytes) for internal buffers, it seems reasonable to take this memory pre-allocation approach. I suppose that is the original reason behind this design, and the feature was enabled by default before #10751. Since that PR it is disabled by default, but users can still turn it on with cv::cuda::setBufferPoolUsage(true);.

However, in my opinion, this feature should not be used, since enabling it can lead to several problems.

The problems are:

  1. The current design does not interact well with the device reset function. The pre-allocated CUDA memory is freed when the CUDA context is destroyed by cv::cuda::resetDevice(), but the Stream module is not notified of this, leading to failures. The following code snippets fail to run because of this.

  2. It is error-prone, since users have to keep the deallocation order in mind. The following code shows different images for a seemingly unchanged variable.

  3. The Stream module is not safe to use in multi-threaded applications.
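To make problem 1 concrete without GPU code, here is a simplified plain-C++ model of the hazard (DeviceContext and StreamModel are hypothetical names, not OpenCV's implementation): the stream captures a handle to the pool at construction time, and a later context reset frees the pool without the stream ever learning about it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the resetDevice() hazard (not OpenCV code).
struct DeviceContext {
    std::vector<unsigned char>* pool;
    DeviceContext() : pool(new std::vector<unsigned char>(1024)) {}
    // Models cv::cuda::resetDevice(): all device allocations are destroyed.
    void reset() {
        delete pool;
        pool = nullptr;
    }
};

struct StreamModel {
    // The stream caches the pool pointer at construction and is never
    // notified when the context is reset, so the handle can go stale.
    std::vector<unsigned char>* cached_pool;
    explicit StreamModel(DeviceContext& ctx) : cached_pool(ctx.pool) {}
};
```

After ctx.reset(), the stream still holds the old pointer; any allocation routed through it would touch freed memory, which matches the failure mode described above.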
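Problem 2 follows from the stack discipline: releasing a buffer rewinds the stack top, so releasing out of order silently recycles a still-live buffer. A minimal plain-C++ simulation (StackPool is a hypothetical model, not OpenCV's code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stack-pool model: deallocation rewinds the top offset, which
// is only safe when buffers are released in reverse (LIFO) order.
struct StackPool {
    std::vector<unsigned char> mem;
    std::size_t top = 0;
    explicit StackPool(std::size_t n) : mem(n) {}
    unsigned char* alloc(std::size_t n) {
        unsigned char* p = mem.data() + top;
        top += n;
        return p;
    }
    // Releasing a buffer rewinds the stack to its start offset, implicitly
    // "freeing" every buffer allocated after it as well.
    void release_at(std::size_t offset) { top = offset; }
};
```

If buffer A is released before buffer B (non-LIFO), a fresh allocation can land on top of B's bytes, so B appears to change even though it was never written to again.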
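Problem 3 can be illustrated by simulating, sequentially, one possible interleaving of two threads bumping an unsynchronized top offset (again a sketch of the race, not OpenCV code): both threads can read the same offset and receive the same buffer.

```cpp
#include <cstddef>

// Sequential simulation of a lost update on an unsynchronized bump
// allocator shared by two threads. Each thread conceptually executes:
//     p = pool + top; top += 16;
// Under the interleaving below, both loads of `top` happen before either
// store, so both threads receive the same buffer.
inline bool interleaving_aliases() {
    static unsigned char pool[64];
    std::size_t top = 0;

    std::size_t loadA = top;            // thread A reads top == 0
    std::size_t loadB = top;            // thread B reads top == 0 (lost update)
    unsigned char* pA = pool + loadA;
    unsigned char* pB = pool + loadB;
    top = loadA + 16;                   // thread A stores 16
    top = loadB + 16;                   // thread B stores 16 again
    (void)top;

    return pA == pB;                    // true: the two buffers alias
}
```

Avoiding this would require synchronizing every pool operation, which the current design does not do.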

All of the code snippets mentioned above run without error when the memory pre-allocation mechanism is disabled, either by replacing setBufferPoolUsage(true); with setBufferPoolUsage(false); or simply by deleting the setBufferPoolUsage(true); line.

So I am suggesting that the memory pre-allocation feature of the Stream module be removed. To achieve this, the StackAllocator, MemoryPool, and BufferPool classes would have to be deprecated or removed, and existing CUDA memory allocations that go through BufferPool would have to be replaced with ordinary GpuMat allocations using the DefaultAllocator.

Since all of these changes may be too radical, another option I can think of is to leave things as they are (the feature disabled by default) and warn users who want to enable it by rewriting the documentation of the setBufferPoolUsage function.
