Description
Summary
The flashvsr pipeline reports "No CUDA GPUs are available" at load time on fal.ai GPU-H100 workers where CUDA is otherwise functioning normally. The error recurs across multiple load attempts within the same session, preventing the pipeline from becoming usable.
Error Details
Observed in fal.ai prod logs, 2026-03-12 21:16–21:20 UTC:
scope.server.pipeline_manager - ERROR - Failed to load pipeline flashvsr:
No CUDA GPUs are available. If this error persists, consider removing the
models directory '/data/models' and re-downloading models.
scope.server.pipeline_manager - ERROR - Failed to load pipeline: flashvsr
scope.server.pipeline_manager - ERROR - Some pipelines failed to load
Multiple attempts (at least 3 within the same session, spanning roughly 4 minutes):
- 2026-03-12 21:16:17 UTC
- 2026-03-12 21:20:09 UTC
- 2026-03-12 21:20:33 UTC
- fal.ai job: e416f15d-e5f9-4c66-8422-57623b915c94
- fal.ai node: 1dfe12b6-1fe9-bed3-17dd-e2ae8651c9fc (worker type: fal-jobs/GPU-H100)
Why This Is Suspicious
The same session/job is actively processing frames (the logs show NDI output errors, spout errors, and other pipelines loading), so the GPU worker is alive and other operations run fine. Only flashvsr raises "No CUDA GPUs are available", which suggests the exception originates in the flashvsr plugin's own initialization rather than in a system-wide CUDA absence.
Possible causes:
- The flashvsr plugin calls torch.cuda.is_available() or checks device_count() in a way that fails on this particular driver/environment configuration
- The plugin checks for a specific CUDA capability not present on this H100 node variant
- A CUDA_VISIBLE_DEVICES environment variable is set to an unexpected value at plugin load time
- Race condition: plugin loads before CUDA context is fully initialized
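To narrow down which of these causes applies, it may help to capture a snapshot of the CUDA state from inside the worker process right before the plugin loads. The following is a minimal diagnostic sketch, not code from the flashvsr plugin: the helper names `describe_visible_devices` and `collect_cuda_diagnostics` are hypothetical, and only the standard `torch.cuda` calls (`is_available`, `device_count`, `get_device_capability`) are assumed to exist.

```python
import os


def describe_visible_devices(value):
    """Classify a CUDA_VISIBLE_DEVICES value (None means the variable is unset)."""
    if value is None:
        return "unset (all GPUs visible)"
    if value.strip() == "":
        # An empty value typically hides every GPU from the process.
        return "empty (typically hides every GPU)"
    return "set to " + repr(value)


def collect_cuda_diagnostics(environ=None):
    """Gather facts relevant to the candidate causes listed above.

    `environ` only affects how CUDA_VISIBLE_DEVICES is reported here;
    torch itself always reads the real process environment.
    """
    environ = os.environ if environ is None else environ
    report = {
        "CUDA_VISIBLE_DEVICES": describe_visible_devices(
            environ.get("CUDA_VISIBLE_DEVICES")
        ),
    }
    try:
        import torch
    except ImportError:
        report["torch"] = "not importable"
        return report

    report["torch"] = torch.__version__
    report["is_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    if report["is_available"]:
        # (major, minor) compute capability for each visible device.
        report["capabilities"] = [
            torch.cuda.get_device_capability(i)
            for i in range(report["device_count"])
        ]
    return report
```

Logging the result of `collect_cuda_diagnostics()` once at server start and once immediately before the flashvsr load would show whether the environment changes between the two points, which is what the `CUDA_VISIBLE_DEVICES` and race-condition hypotheses predict.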
Distinction from #574
Issue #574 reports "Invalid pipeline ID: flashvsr" (pipeline registry/model lookup failure). This issue is a different error message ("No CUDA GPUs are available") on a GPU-equipped worker where the ID itself appears to resolve correctly.
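When triaging logs, the two failures can be told apart by their message text alone. The sketch below is illustrative only: the function name `classify_failure` and the signature labels are hypothetical, and the patterns are taken from the messages quoted in this issue and in #574.

```python
import re

# Message fragments quoted in this issue and in #574.
SIGNATURES = {
    "no_cuda_gpus": re.compile(r"No CUDA GPUs are available"),
    "invalid_pipeline_id": re.compile(r"Invalid pipeline ID: (\S+)"),
}


def classify_failure(log_text):
    """Return the label of the first matching signature, or None if none match."""
    for label, pattern in SIGNATURES.items():
        if pattern.search(log_text):
            return label
    return None
```

A line that matches neither pattern returns `None`, so unrelated pipeline errors are not misattributed to either issue.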
Impact
Users who select the flashvsr pipeline on fal.ai cannot use it. From the UI perspective the failure is effectively silent: it surfaces only as the generic "Some pipelines failed to load" message, with no mention of CUDA or flashvsr.