
Conversation

@NicolasHug (Contributor) commented on Oct 8, 2025

On main, this fails:

```python
from torchcodec.decoders import VideoDecoder
from joblib import Parallel, delayed

video_path = "/home/nicolashug/videos_h264/vid.mp4"

def decode_one_video():
    decoder = VideoDecoder(video_path, device="cuda:0:beta", seek_mode="approximate")
    decoder.get_frame_at(-1)

Parallel(n_jobs=8, prefer="threads")(delayed(decode_one_video)() for _ in range(100))
```

with

```
RuntimeError: Failed to get decoder caps: 201
```

That is, spawning one VideoDecoder per thread fails with CUDA error 201, which is CUDA_ERROR_INVALID_CONTEXT.
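
As a point of reference (a minimal standalone sketch, not TorchCodec code): driver-API calls that require a current context return CUDA_ERROR_INVALID_CONTEXT when the calling thread has none, which matches the 201 above; presumably the decoder-caps query has the same requirement.

```cpp
// Minimal sketch: a driver-API call made from a thread with no current
// CUDA context fails with CUDA_ERROR_INVALID_CONTEXT (201).
#include <cuda.h>
#include <cstdio>

int main() {
    cuInit(0);
    // No context has been created or made current on this thread.
    CUdevice dev;
    CUresult res = cuCtxGetDevice(&dev);
    if (res == CUDA_ERROR_INVALID_CONTEXT) {
        std::printf("error %d: no current context\n", static_cast<int>(res));
    }
    return 0;
}
```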

Uh?

I'm confused as well. This seems to indicate that our CUDA context initialization hack, where we create a dummy tensor to force context creation, doesn't work as expected:

```cpp
// Initialize CUDA context with a dummy tensor
torch::Tensor dummyTensorForCudaInitialization = torch::empty(
    {1}, torch::TensorOptions().dtype(torch::kUInt8).device(device_));
```

After a lot of trial and error, it seems that using torch::zeros instead of torch::empty resolves the problem. Why? I have no idea. Maybe the torch::empty call was optimized out, since its result is never read? Perhaps, but that doesn't explain why the default CUDA interface works fine with the snippet above... Anyway, both interfaces now use torch::zeros, and both work when running my multithreaded benchmarks.
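
For illustration, this is the shape of the change (a sketch based on the snippet above, not the exact diff):

```cpp
// Initialize the CUDA context with a dummy tensor. Unlike torch::empty,
// torch::zeros actually writes to the allocation, so the call has an
// observable side effect and may be harder to elide.
torch::Tensor dummyTensorForCudaInitialization = torch::zeros(
    {1}, torch::TensorOptions().dtype(torch::kUInt8).device(device_));
```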

@meta-cla bot added the CLA Signed label on Oct 8, 2025
@NicolasHug merged commit 986f10c into meta-pytorch:main on Oct 9, 2025
50 checks passed
@NicolasHug deleted the fix-cuda-context-init branch on October 9, 2025