Steps to reproduce
- Try to provision a RunPod cluster via dstack, e.g. a two-pod H100 cluster that is currently available.
- Provisioning fails as if there were no capacity, with an obscure error from RunPod:
WARNING 2026-05-27T10:33:42.249 dstack._internal.server.background.pipeline_tasks.jobs_submitted
job(50b3ad)tame-snake-1-0-0: NVIDIA H100 80GB HBM3 launch in runpod/EUR-IS-3 failed:
RunpodApiClientError([{'message': 'Error creating cluster - cluster creation failed', 'path':
['createCluster'], 'extensions': {'code': 'RUNPOD'}}])
Actual behaviour
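If I drop minCudaVersion: "12.8" from the request, then provisioning succeeds. The field is added in dstack/src/dstack/_internal/core/backends/runpod/api_client.py (line 703 in d38ae9b):
    input_fields.append(f'minCudaVersion: "{RunpodProvider.MIN_CUDA_VERSION}"')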
But the host does have an NVIDIA driver supporting CUDA 12.8, so minCudaVersion seems to work incorrectly. Moreover, specifying a lower value does not help: "11" and "11.1" both fail as well. My guess is that minCudaVersion does not work for CreateCluster even though it's listed in the reference.
Introduced in #3304, so RunPod clusters have not been working since then.
As a workaround, we can drop minCudaVersion from the CreateCluster request until this is clarified or fixed on the RunPod side.
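A rough sketch of what the workaround could look like, assuming the input fields are assembled as a list of strings as in the line quoted above. The function name `build_input_fields`, the `is_cluster` flag, and the `name: "demo"` field are hypothetical and only for illustration; the real code in api_client.py is structured differently.

```python
from typing import List, Optional

# Value taken from the issue text; in dstack it lives on RunpodProvider.
MIN_CUDA_VERSION = "12.8"


def build_input_fields(
    base_fields: List[str],
    min_cuda_version: Optional[str] = MIN_CUDA_VERSION,
    is_cluster: bool = False,  # hypothetical flag, not part of dstack
) -> List[str]:
    """Return request input fields, omitting minCudaVersion for cluster requests."""
    input_fields = list(base_fields)  # copy so the caller's list is not mutated
    # Workaround: only send minCudaVersion for non-cluster requests,
    # since CreateCluster appears to fail whenever the field is present.
    if min_cuda_version is not None and not is_cluster:
        input_fields.append(f'minCudaVersion: "{min_cuda_version}"')
    return input_fields


# Illustrative placeholder field, not a real request payload.
pod_fields = build_input_fields(['name: "demo"'])
cluster_fields = build_input_fields(['name: "demo"'], is_cluster=True)
```

With this shape, single-pod requests would keep the CUDA constraint while cluster requests skip it, which matches the observation that provisioning succeeds once the field is removed.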
Expected behaviour
No response
dstack version
master
Server logs
Additional information
No response