Providing an option to use a non-default stream #209

michaeleisel · 2024-03-08T01:01:42Z

I'm trying to improve the performance of my candle models via CUDA streams, which I've benchmarked to be helpful. However, I want to be sure that this is safe to do, at least with the restrictions I've imposed on myself (cudarc Device and candle Tensor objects may only be accessed from the thread in which they were created, and candle Tensors may only perform operations with other Tensors of the same Device instance). The problem is that cudarc always uses the default legacy stream. I'd love to instead have an option to use the old behavior of creating a new stream for each new device, if it'd be safe to do so.

coreylowman · 2024-03-13T17:03:10Z

This should already be possible via CudaStream, though you have to manage the additional stream yourself instead of cudarc managing it for you.

michaeleisel · 2024-03-13T17:10:10Z

If I want to run, say, CudaDevice::memset_zeros(), I don't see any way to do it without going through the default legacy stream, because I don't see any option to customize the stream.

coreylowman · 2024-03-13T17:25:29Z

Ah gotcha, you're right. Would you want an option on creation of the CudaDevice to use a non-null stream?

michaeleisel · 2024-03-13T18:30:41Z

I think that could be an interesting way to do it, yeah. My one concern would be whether any downstream libraries, like candle, assume that if two devices have the same ordinal, then their operations are on the same stream. But maybe that would be considered overreliance on implementation details. For candle, it looks like they use a stricter check based on a generated unique ID rather than the ordinal, so that approach wouldn't cause any problems for that library.
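To make the ordinal-versus-unique-ID distinction concrete, here is a minimal, self-contained sketch. `DeviceHandle`, `same_ordinal`, `same_instance`, and `demo` are hypothetical names made up for illustration; this is not cudarc's or candle's actual code, only an assumption about what such a check could look like.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a device handle (NOT a real cudarc/candle type).
// `ordinal` names the physical GPU; `unique_id` is distinct per handle, so two
// handles on the same GPU (possibly with different streams) never compare equal.
#[derive(Debug)]
struct DeviceHandle {
    ordinal: usize,
    unique_id: usize,
}

// Process-wide counter used to hand out a fresh ID to every handle.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

impl DeviceHandle {
    fn new(ordinal: usize) -> Self {
        let unique_id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        DeviceHandle { ordinal, unique_id }
    }

    // Loose check: true for any two handles on the same physical GPU.
    fn same_ordinal(&self, other: &DeviceHandle) -> bool {
        self.ordinal == other.ordinal
    }

    // Strict check: true only when both refer to the very same handle.
    fn same_instance(&self, other: &DeviceHandle) -> bool {
        self.unique_id == other.unique_id
    }
}

// Creates two handles for GPU 0 and reports (loose check, strict check).
fn demo() -> (bool, bool) {
    let a = DeviceHandle::new(0);
    let b = DeviceHandle::new(0);
    (a.same_ordinal(&b), a.same_instance(&b))
}
```

With a strict check like `same_instance`, mixing tensors from two handles that share an ordinal is rejected up front, so code never silently assumes both handles issue work on the same stream.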

michaeleisel mentioned this issue Mar 15, 2024

Support for CUDA Streams huggingface/candle#1751

Open

coreylowman added a commit that referenced this issue May 1, 2024

#209 Add CudaDevice::new_with_stream

82813cc

coreylowman mentioned this issue May 1, 2024

Add CudaDevice::new_with_stream #224

Merged

coreylowman added a commit that referenced this issue May 1, 2024

#209 Add CudaDevice::new_with_stream (#224)

919fe10

coreylowman closed this as completed in #224 May 1, 2024
