Option to capture to GPU memory

Some users may want their screenshots in GPU memory.  This is common for deep learning and other AI use cases, as well as for GPU-accelerated video encoding or for redisplaying the contents inside a separate application.  This is an exciting possibility!

Currently, with MSS, doing AI with the output means that MSS (implicitly) copies the screenshot from the GPU to the CPU, the user code puts that into a PyTorch tensor or similar, and then uploads it right back to the GPU.  The cost of this round trip is not just the PCIe transfer time: there's a lot of overhead in setting up and synchronizing these copies.

The relevant frameworks (such as PyTorch, CuPy, and Numba) support the [`__cuda_array_interface__` API](https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html).  This lets a producer, such as MSS, create an array object whose contents are on the GPU.  Similarly, the [DLPack API](https://dmlc.github.io/dlpack/latest/) allows capturing to other GPU backing stores (such as ROCm); although this is a little more involved, it is the direction that a lot of these frameworks are taking for future work.  While OpenCV does not (as far as I could tell) support these data exchange APIs directly, the `cv2.cuda.createGpuMatFromCudaMemory` function can still be used to wrap the same device pointer.
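To make the producer side concrete, here is a minimal sketch of the dictionary that `__cuda_array_interface__` (version 3) expects for a BGRA frame.  The function name, the `pitch` parameter, and the pointer value in the usage note are hypothetical and not part of MSS; the sketch only builds the description and does not touch the GPU.

```python
from typing import Any, Dict, Optional

BYTES_PER_PIXEL = 4  # BGRA, one unsigned byte per channel


def cuda_array_interface_for_bgra(
    device_ptr: int, width: int, height: int, pitch: Optional[int] = None
) -> Dict[str, Any]:
    """Describe a BGRA frame in GPU memory as a version 3 interface dict.

    `device_ptr` is assumed to be a valid device address obtained from some
    capture backend; this helper is illustrative only.
    """
    row_bytes = width * BYTES_PER_PIXEL
    if pitch is None:
        pitch = row_bytes
    if pitch < row_bytes:
        raise ValueError("pitch is smaller than one row of pixels")
    # strides may be None for a C-contiguous buffer; GPU surfaces often pad
    # each row, in which case explicit byte strides are required.
    strides = None if pitch == row_bytes else (pitch, BYTES_PER_PIXEL, 1)
    return {
        "shape": (height, width, BYTES_PER_PIXEL),
        "typestr": "|u1",            # unsigned 8-bit, byte order not applicable
        "data": (device_ptr, True),  # (device pointer, read_only flag)
        "strides": strides,
        "version": 3,
    }
```

An object that returns this dict from its `__cuda_array_interface__` attribute, with a real device pointer, could then be handed to a consumer such as `cupy.asarray(...)` without a trip through CPU memory.  The producer has to keep the underlying GPU allocation alive for as long as consumers hold a view of it.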

While the existing screenshot backends all create their screenshots in CPU memory, a lot of more modern screenshot APIs — such as Linux’s XComposite or Windows’s Desktop Duplication — capture screenshots to GPU memory (as an X Pixmap or a DirectX surface) first.

If we start adding these backends, we can give users the option to let the screenshot stay in GPU-resident memory, rather than always copying it to the CPU.  If the user specifies this option, then the returned `ScreenShot` object could implement the `__cuda_array_interface__` or DLPack API instead of the `__array_interface__` API.  This would let frameworks like PyTorch use it directly.
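One way the "instead of" part might look is sketched below: a stand-in class that advertises exactly one of the two interfaces, depending on where the pixels live.  `SketchScreenShot` and its constructor arguments are hypothetical names for illustration and are not the existing `ScreenShot` API; the sketch assumes tightly packed BGRA rows.

```python
from typing import Any, Dict, Optional


class SketchScreenShot:
    """Hypothetical stand-in for a screenshot held in CPU or GPU memory."""

    def __init__(
        self,
        width: int,
        height: int,
        *,
        cpu_bytes: Optional[bytes] = None,
        device_ptr: Optional[int] = None,
    ) -> None:
        if (cpu_bytes is None) == (device_ptr is None):
            raise ValueError("provide exactly one of cpu_bytes or device_ptr")
        self.width = width
        self.height = height
        self._cpu_bytes = cpu_bytes
        self._device_ptr = device_ptr

    def _layout(self) -> Dict[str, Any]:
        # Fields shared by both protocols: BGRA, one unsigned byte per channel.
        return {"shape": (self.height, self.width, 4), "typestr": "|u1", "version": 3}

    @property
    def __array_interface__(self) -> Dict[str, Any]:
        if self._cpu_bytes is None:
            # Raising AttributeError makes hasattr() report False, so CPU
            # consumers do not mistake a device pointer for host memory.
            raise AttributeError("pixels are GPU-resident")
        return {**self._layout(), "data": self._cpu_bytes}

    @property
    def __cuda_array_interface__(self) -> Dict[str, Any]:
        if self._device_ptr is None:
            raise AttributeError("pixels are CPU-resident")
        return {**self._layout(), "data": (self._device_ptr, True)}
```

Because consumers typically probe for these attributes with `hasattr` or `getattr`, exposing only the matching one keeps a GPU-resident screenshot from being silently treated as a CPU array, and vice versa.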

Option to capture to GPU memory #422