Description
When running the same GPU-enabled image with docker and nerdctl, using --gpus all and the command nvidia-smi, only nerdctl fails, because libnvidia-ml.so cannot be found:
# nerdctl run --rm -it --gpus all env12.com/cuda13.0.1-cudnn9-py3.12-torch2.9.0:251031 nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

nerdctl containers (with --gpus all) only have the real library file; the symlink is missing unless ldconfig is run manually inside the container.
# nerdctl run --rm -it --gpus all env12.com/cuda13.0.1-cudnn9-py3.12-torch2.9.0:251031 find / -name libnvidia-ml.so*
/usr/local/cuda-13.0/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
/usr/lib64/libnvidia-ml.so.535.261.03
# docker run --rm -it --gpus all env12.com/cuda13.0.1-cudnn9-py3.12-torch2.9.0:251031 find / -name libnvidia-ml.so*
/usr/local/cuda-13.0/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
/usr/lib64/libnvidia-ml.so.1
/usr/lib64/libnvidia-ml.so.535.261.03

Docker works because nvidia-container-runtime-hook reads /etc/nvidia-container-runtime/config.toml and passes --ldconfig=@/sbin/ldconfig to nvidia-container-cli configure, while nerdctl calls nvidia-container-cli directly, so it needs an explicit --ldconfig argument for ldconfig to run inside the container and create the symlinks.
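To illustrate the lookup described above, here is a minimal sketch that extracts the ldconfig setting from the [nvidia-container-cli] section of config.toml. The function name ldconfig_from_config is hypothetical, and the awk parsing is deliberately simplistic (for example, it does not handle trailing comments). It is not the hook's actual implementation, only a way to see which value would end up in the --ldconfig argument on a given host.

```shell
# Sketch (assumption: the setting lives under [nvidia-container-cli] as
# `ldconfig = "<path>"`). Prints the configured value, e.g. @/sbin/ldconfig.
ldconfig_from_config() {
  config="${1:-/etc/nvidia-container-runtime/config.toml}"
  awk '
    # Track whether the current TOML section is [nvidia-container-cli].
    /^[ \t]*\[/ { in_section = ($0 ~ /^[ \t]*\[nvidia-container-cli\]/) }
    # First uncommented ldconfig key inside that section wins.
    in_section && /^[ \t]*ldconfig[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")   # drop the key and the "=" sign
      gsub(/"/, "")              # drop the surrounding quotes
      print
      exit
    }
  ' "$config"
}

# Usage: show the flag that would be passed to nvidia-container-cli configure.
# printf -- '--ldconfig=%s\n' "$(ldconfig_from_config)"
```

If the function prints nothing, the host config has no uncommented ldconfig entry in that section.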
Steps to reproduce the issue
Describe the results you received and expected
Expected: Containers started with nerdctl run --gpus all should expose the same NVIDIA libraries and SONAME symlinks as docker run --gpus all, including:
/usr/lib64/libnvidia-ml.so.<full-version>
/usr/lib64/libnvidia-ml.so.1 (symlink created by ldconfig)
Received: docker containers (with --gpus all) have both the real library file and the symlink (libnvidia-ml.so.1). nerdctl containers (with --gpus all) only have the real library file; the symlink is missing unless ldconfig is run manually inside the container.
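Until nerdctl passes --ldconfig itself, the manual ldconfig step mentioned above can be folded into the container command. The following is a minimal workaround sketch, not a fix: the helper name run_gpu_with_ldconfig and the NERDCTL override variable are hypothetical, and it assumes the image provides sh and ldconfig and that the container user is allowed to run ldconfig.

```shell
# Hypothetical helper: refresh the linker cache (creating the SONAME symlinks)
# inside the container, then exec the requested command.
# NERDCTL may be overridden, e.g. NERDCTL=echo for a dry run.
run_gpu_with_ldconfig() {
  image="$1"
  shift
  "${NERDCTL:-nerdctl}" run --rm --gpus all "$image" \
    sh -c 'ldconfig && exec "$@"' sh "$@"
}

# Usage (image name taken from the report above):
# run_gpu_with_ldconfig env12.com/cuda13.0.1-cudnn9-py3.12-torch2.9.0:251031 nvidia-smi
```

Passing the command through "$@" rather than interpolating it into the sh -c string keeps arguments with spaces intact.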
What version of nerdctl are you using?
nerdctl 2.2.0
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response