Description
Hi, I am trying to run inference with ONNX Runtime on Azure using a Standard_NV6ads_A10_v5 VM, which exposes 1/6th of an A10 GPU, and I get CUDA error 801 (cudaErrorNotSupported) when creating an InferenceSession. Everything works as expected on a VM with a full GPU (NCasT4_v3).
Is CUDA inference supported on these partial-GPU VMs? Do I need to install anything specific to enable it?
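For reference, a minimal way to check that the container actually sees the GPU and has compute capabilities granted (assuming the host has the NVIDIA container toolkit installed; `myapp:latest` is a placeholder image name):

```shell
# Confirm the driver and the partial A10 device are visible inside the container
docker run --rm --gpus all myapp:latest nvidia-smi

# Check which driver capabilities the container was granted;
# "compute" must be included for CUDA calls like cudaSetDevice to work
docker run --rm --gpus all myapp:latest sh -c 'echo "$NVIDIA_DRIVER_CAPABILITIES"'
```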
Further information
Error:
```
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; SUCCTYPE = cudaError; std::conditional_t<THRW, void, common::Status> = void]
/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; SUCCTYPE = cudaError; std::conditional_t<THRW, void, common::Status> = void]
CUDA failure 801: operation not supported ; GPU=0 ; hostname=va-forensics-processor-6cbddfc8cf-275l8 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=282 ; expr=cudaSetDevice(info_.device_id);
```
Notes
Ubuntu 24.04 / Docker / C#
ONNX Runtime NuGet package: Microsoft.ML.OnnxRuntime.Gpu.Linux 1.21.0
CUDA: 12.6
cuDNN: 9.6.0.74
Dockerfile:
```dockerfile
ARG UBUNTU_YEAR=24
ARG UBUNTU_MONTH=04
FROM ubuntu:$UBUNTU_YEAR.$UBUNTU_MONTH AS base

# Re-declare the build args: ARGs defined before FROM are out of scope after it
ARG UBUNTU_YEAR
ARG UBUNTU_MONTH

ENV CUDA_MAJOR_VERSION=12
ENV CUDA_MINOR_VERSION=6
ENV CUDNN_MAJOR_VERSION=9
ENV TENSORRT_MAJOR_VERSION=10

RUN apt-get update && \
    apt-get install -y --no-install-recommends wget && \
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu$UBUNTU_YEAR$UBUNTU_MONTH/x86_64/cuda-keyring_1.1-1_all.deb && \
    dpkg -i cuda-keyring_1.1-1_all.deb && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        cuda-cudart-$CUDA_MAJOR_VERSION-$CUDA_MINOR_VERSION \
        cuda-nvrtc-$CUDA_MAJOR_VERSION-$CUDA_MINOR_VERSION \
        libcublas-$CUDA_MAJOR_VERSION-$CUDA_MINOR_VERSION \
        libcufft-$CUDA_MAJOR_VERSION-$CUDA_MINOR_VERSION \
        libcurand-$CUDA_MAJOR_VERSION-$CUDA_MINOR_VERSION \
        libcudnn$CUDNN_MAJOR_VERSION-cuda-$CUDA_MAJOR_VERSION \
        libnvinfer-plugin$TENSORRT_MAJOR_VERSION \
        libnvonnxparsers$TENSORRT_MAJOR_VERSION

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute
ENV PATH=/usr/local/cuda/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/local/cuda/targets/x86_64-linux/lib:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
ENV CUDA_HOME=/usr/local/cuda
```
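For completeness, session creation follows this minimal sketch (shown here with the ONNX Runtime Python API rather than C# for brevity; `model.onnx` and the device id are placeholders). Requesting the CPU provider as a fallback at least confirms whether the model itself loads when the CUDA EP fails to initialize:

```python
import onnxruntime as ort

# Try the CUDA execution provider first, then fall back to CPU.
# "model.onnx" is a placeholder path.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0}),
    "CPUExecutionProvider",
]
session = ort.InferenceSession("model.onnx", providers=providers)

# Shows which providers were actually enabled for this session
print(session.get_providers())
```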