
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1 #154

Open
Dan-Burns opened this issue Dec 2, 2022 · 47 comments

Comments

@Dan-Burns

Hello,

I tried the various combinations of conda and pip packages that people suggest to get TensorFlow running on the RTX 30 series. I thought it was working after exercising the GPU with the Keras tutorial code, but when I moved to a different type of model something apparently broke.

Now I'm trying the docker route.
docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:22.11-tf2-py3
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
There seem to be a lot of missing libraries.

3. Information to attach (optional if deemed irrelevant)

  • [x] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
  • I1202 15:15:34.407243 26518 nvc.c:376] initializing library context (version=1.11.0, build=)
    I1202 15:15:34.407353 26518 nvc.c:350] using root /
    I1202 15:15:34.407365 26518 nvc.c:351] using ldcache /etc/ld.so.cache
    I1202 15:15:34.407377 26518 nvc.c:352] using unprivileged user 1000:1000
    I1202 15:15:34.407426 26518 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
    I1202 15:15:34.408137 26518 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
    W1202 15:15:34.411623 26519 nvc.c:273] failed to set inheritable capabilities
    W1202 15:15:34.411736 26519 nvc.c:274] skipping kernel modules load due to failure
    I1202 15:15:34.412602 26520 rpc.c:71] starting driver rpc service
    I1202 15:15:34.433974 26521 rpc.c:71] starting nvcgo rpc service
    I1202 15:15:34.438005 26518 nvc_info.c:766] requesting driver information with ''
    I1202 15:15:34.445181 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.520.56.06
    I1202 15:15:34.445313 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.520.56.06
    I1202 15:15:34.445952 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.520.56.06
    I1202 15:15:34.446254 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.520.56.06
    I1202 15:15:34.446554 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.520.56.06
    I1202 15:15:34.446877 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.520.56.06
    I1202 15:15:34.447241 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.520.56.06
    I1202 15:15:34.447301 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.520.56.06
    I1202 15:15:34.447405 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.520.56.06
    I1202 15:15:34.447490 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.520.56.06
    I1202 15:15:34.447550 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.520.56.06
    I1202 15:15:34.447813 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.520.56.06
    I1202 15:15:34.448099 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.520.56.06
    I1202 15:15:34.448197 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.520.56.06
    I1202 15:15:34.448693 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.520.56.06
    I1202 15:15:34.448755 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.520.56.06
    I1202 15:15:34.449075 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.520.56.06
    I1202 15:15:34.449417 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.520.56.06
    I1202 15:15:34.450211 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcudadebugger.so.520.56.06
    I1202 15:15:34.450273 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.520.56.06
    I1202 15:15:34.450625 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.520.56.06
    I1202 15:15:34.450896 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.520.56.06
    I1202 15:15:34.451174 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.520.56.06
    I1202 15:15:34.451236 26518 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.520.56.06
    I1202 15:15:34.451580 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.520.56.06
    I1202 15:15:34.451929 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.520.56.06
    I1202 15:15:34.452169 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.520.56.06
    I1202 15:15:34.452413 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.520.56.06
    I1202 15:15:34.452680 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.520.56.06
    I1202 15:15:34.452975 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.520.56.06
    I1202 15:15:34.453288 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.520.56.06
    I1202 15:15:34.453571 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.520.56.06
    I1202 15:15:34.453833 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.520.56.06
    I1202 15:15:34.454141 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.520.56.06
    I1202 15:15:34.454359 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.520.56.06
    I1202 15:15:34.455059 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.520.56.06
    I1202 15:15:34.455764 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-allocator.so.520.56.06
    I1202 15:15:34.456075 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.520.56.06
    I1202 15:15:34.456395 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.520.56.06
    I1202 15:15:34.456750 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.520.56.06
    I1202 15:15:34.457050 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.520.56.06
    I1202 15:15:34.457314 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.520.56.06
    I1202 15:15:34.457580 26518 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.520.56.06
    W1202 15:15:34.457645 26518 nvc_info.c:399] missing library libnvidia-nscq.so
    W1202 15:15:34.457659 26518 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
    W1202 15:15:34.457678 26518 nvc_info.c:399] missing library libnvidia-pkcs11.so
    W1202 15:15:34.457694 26518 nvc_info.c:399] missing library libvdpau_nvidia.so
    W1202 15:15:34.457709 26518 nvc_info.c:399] missing library libnvidia-ifr.so
    W1202 15:15:34.457722 26518 nvc_info.c:399] missing library libnvidia-cbl.so
    W1202 15:15:34.457740 26518 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
    W1202 15:15:34.457753 26518 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
    W1202 15:15:34.457768 26518 nvc_info.c:403] missing compat32 library libcudadebugger.so
    W1202 15:15:34.457780 26518 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
    W1202 15:15:34.457792 26518 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
    W1202 15:15:34.457808 26518 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
    W1202 15:15:34.457828 26518 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
    W1202 15:15:34.457843 26518 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
    W1202 15:15:34.457860 26518 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
    W1202 15:15:34.457880 26518 nvc_info.c:403] missing compat32 library libnvoptix.so
    W1202 15:15:34.457894 26518 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
    I1202 15:15:34.460121 26518 nvc_info.c:299] selecting /usr/bin/nvidia-smi
    I1202 15:15:34.460197 26518 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
    I1202 15:15:34.460243 26518 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
    I1202 15:15:34.460336 26518 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
    I1202 15:15:34.460409 26518 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
    W1202 15:15:34.460616 26518 nvc_info.c:425] missing binary nv-fabricmanager
    I1202 15:15:34.460810 26518 nvc_info.c:343] listing firmware path /usr/lib/firmware/nvidia/520.56.06/gsp.bin
    I1202 15:15:34.460876 26518 nvc_info.c:529] listing device /dev/nvidiactl
    I1202 15:15:34.460891 26518 nvc_info.c:529] listing device /dev/nvidia-uvm
    I1202 15:15:34.460904 26518 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
    I1202 15:15:34.460915 26518 nvc_info.c:529] listing device /dev/nvidia-modeset
    I1202 15:15:34.460980 26518 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
    W1202 15:15:34.461036 26518 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
    W1202 15:15:34.461083 26518 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
    I1202 15:15:34.461100 26518 nvc_info.c:822] requesting device information with ''
    I1202 15:15:34.468056 26518 nvc_info.c:713] listing device /dev/nvidia0 (GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b at 00000000:01:00.0)
    NVRM version: 520.56.06
    CUDA version: 11.8

Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3090 Ti
Brand: GeForce
GPU UUID: GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b
Bus Location: 00000000:01:00.0
Architecture: 8.6
I1202 15:15:34.468151 26518 nvc.c:434] shutting down library context
I1202 15:15:34.468317 26521 rpc.c:95] terminating nvcgo rpc service
I1202 15:15:34.469397 26518 rpc.c:132] nvcgo rpc service terminated successfully
I1202 15:15:34.474156 26520 rpc.c:95] terminating driver rpc service
I1202 15:15:34.474599 26518 rpc.c:132] driver rpc service terminated successfully

Timestamp : Fri Dec 2 09:17:13 2022
Driver Version : 520.56.06
CUDA Version : 11.8

Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : NVIDIA GeForce RTX 3090 Ti
Product Brand : GeForce
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Enabled
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-ba9fdcdb-8a2b-d2b6-f69c-5f2ac08dde8b
Minor Number : 0
VBIOS Version : 94.02.A0.00.2D
MultiGPU Board : No
Board ID : 0x100
GPU Part Number : N/A
Module ID : 0
Inforom Version
Image Version : G002.0000.00.03
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x220310DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x88701043
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 1000 KB/s
Rx Throughput : 0 KB/s
Fan Speed : 0 %
Performance State : P8
Clocks Throttle Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 24564 MiB
Reserved : 310 MiB
Used : 510 MiB
Free : 23742 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 13 MiB
Free : 243 MiB
Compute Mode : Default
Utilization
Gpu : 6 %
Memory : 5 %
Encoder : 0 %
Decoder : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 36 C
GPU Shutdown Temp : 97 C
GPU Slowdown Temp : 94 C
GPU Max Operating Temp : 92 C
GPU Target Temperature : 83 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : Supported
Power Draw : 32.45 W
Power Limit : 480.00 W
Default Power Limit : 480.00 W
Enforced Power Limit : 480.00 W
Min Power Limit : 100.00 W
Max Power Limit : 516.00 W
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 2115 MHz
SM : 2115 MHz
Memory : 10501 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 740.000 mV
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2283
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 259 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2441
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 52 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 3320
Type : G
Name : /opt/docker-desktop/Docker Desktop --type=gpu-process --enable-crashpad --enable-crash-reporter=46721d59-e3cc-4241-8f96-57bab71f8674,no_channel --user-data-dir=/home/kanaka/.config/Docker Desktop --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,777493636119283380,17735576311253417080,131072 --disable-features=SpareRendererForSitePerProcess
Used GPU Memory : 27 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 4402
Type : C+G
Name : /opt/google/chrome/chrome --type=gpu-process --enable-crashpad --crashpad-handler-pid=4367 --enable-crash-reporter=, --change-stack-guard-on-fork=enable --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAEAAAA4AAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,1352372760819385498,10632265477078674372,131072
Used GPU Memory : 166 MiB

  • [x] Docker version from docker version
  • Client: Docker Engine - Community
    Cloud integration: v1.0.29
    Version: 20.10.21
    API version: 1.41
    Go version: go1.18.7
    Git commit: baeda1f
    Built: Tue Oct 25 18:01:58 2022
    OS/Arch: linux/amd64
    Context: desktop-linux
    Experimental: true

Server: Docker Desktop 4.15.0 (93002)
Engine:
Version: 20.10.21
API version: 1.41 (minimum version 1.12)
Go version: go1.18.7
Git commit: 3056208
Built: Tue Oct 25 18:00:19 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.10
GitCommit: 770bd0108c32f3fb5c73ae1264f7e503fe7b2661
runc:
Version: 1.1.4
GitCommit: v1.1.4-0-g5fd4c4d
docker-init:
Version: 0.19.0
GitCommit: de40ad0

  • [x] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
    -Desired=Unknown/Install/Remove/Purge/Hold
    | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
    |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
    ||/ Name Version Architecture Description
    +++-===================================-============================-============-========================================================================
    un libgldispatch0-nvidia (no description available)
    ii libnvidia-cfg1-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-cfg1-520
    ii libnvidia-cfg1-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVIDIA binary OpenGL/GLX configuration library
    un libnvidia-cfg1-any (no description available)
    un libnvidia-common (no description available)
    ii libnvidia-common-515 520.56.06-0lambda0.22.04.3 all Transitional package for libnvidia-common-520
    ii libnvidia-common-520 520.56.06-0lambda0.22.04.3 all Shared files used by the NVIDIA libraries
    un libnvidia-compute (no description available)
    ii libnvidia-compute-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-compute-520
    ii libnvidia-compute-515:i386 520.56.06-0lambda0.22.04.3 i386 Transitional package for libnvidia-compute-520
    ii libnvidia-compute-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVIDIA libcompute package
    ii libnvidia-compute-520:i386 520.56.06-0lambda0.22.04.3 i386 NVIDIA libcompute package
    ii libnvidia-container-tools 1.11.0+dfsg-0lambda0.22.04.1 amd64 Package for configuring containers with NVIDIA hardware (CLI tool)
    ii libnvidia-container1:amd64 1.11.0+dfsg-0lambda0.22.04.1 amd64 Package for configuring containers with NVIDIA hardware (shared library)
    un libnvidia-decode (no description available)
    ii libnvidia-decode-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-decode-520
    ii libnvidia-decode-515:i386 520.56.06-0lambda0.22.04.3 i386 Transitional package for libnvidia-decode-520
    ii libnvidia-decode-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVIDIA Video Decoding runtime libraries
    ii libnvidia-decode-520:i386 520.56.06-0lambda0.22.04.3 i386 NVIDIA Video Decoding runtime libraries
    ii libnvidia-egl-wayland1:amd64 1:1.1.9-1.1 amd64 Wayland EGL External Platform library -- shared library
    un libnvidia-encode (no description available)
    ii libnvidia-encode-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-encode-520
    ii libnvidia-encode-515:i386 520.56.06-0lambda0.22.04.3 i386 Transitional package for libnvidia-encode-520
    ii libnvidia-encode-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVENC Video Encoding runtime library
    ii libnvidia-encode-520:i386 520.56.06-0lambda0.22.04.3 i386 NVENC Video Encoding runtime library
    un libnvidia-encode1 (no description available)
    un libnvidia-extra (no description available)
    ii libnvidia-extra-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-extra-520
    ii libnvidia-extra-520:amd64 520.56.06-0lambda0.22.04.3 amd64 Extra libraries for the NVIDIA driver
    ii libnvidia-extra-520:i386 520.56.06-0lambda0.22.04.3 i386 Extra libraries for the NVIDIA driver
    un libnvidia-fbc1 (no description available)
    ii libnvidia-fbc1-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-fbc1-520
    ii libnvidia-fbc1-515:i386 520.56.06-0lambda0.22.04.3 i386 Transitional package for libnvidia-fbc1-520
    ii libnvidia-fbc1-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
    ii libnvidia-fbc1-520:i386 520.56.06-0lambda0.22.04.3 i386 NVIDIA OpenGL-based Framebuffer Capture runtime library
    un libnvidia-gl (no description available)
    un libnvidia-gl-390 (no description available)
    un libnvidia-gl-410 (no description available)
    un libnvidia-gl-470 (no description available)
    un libnvidia-gl-495 (no description available)
    ii libnvidia-gl-515:amd64 520.56.06-0lambda0.22.04.3 amd64 Transitional package for libnvidia-gl-520
    ii libnvidia-gl-515:i386 520.56.06-0lambda0.22.04.3 i386 Transitional package for libnvidia-gl-520
    ii libnvidia-gl-520:amd64 520.56.06-0lambda0.22.04.3 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
    ii libnvidia-gl-520:i386 520.56.06-0lambda0.22.04.3 i386 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
    un libnvidia-legacy-390xx-egl-wayland1 (no description available)
    un libnvidia-ml1 (no description available)
    un nvidia-common (no description available)
    un nvidia-compute-utils (no description available)
    ii nvidia-compute-utils-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-compute-utils-520
    ii nvidia-compute-utils-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA compute utilities
    un nvidia-contaienr-runtime (no description available)
    un nvidia-container-runtime (no description available)
    un nvidia-container-runtime-hook (no description available)
    ii nvidia-container-toolkit 1.11.0-0lambda0.22.04.1 amd64 OCI hook for configuring containers for NVIDIA hardware
    ii nvidia-container-toolkit-base 1.11.0-0lambda0.22.04.1 amd64 OCI hook for configuring containers for NVIDIA hardware
    ii nvidia-dkms-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-dkms-520
    ii nvidia-dkms-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA DKMS package
    un nvidia-dkms-kernel (no description available)
    un nvidia-driver (no description available)
    ii nvidia-driver-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-driver-520
    ii nvidia-driver-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA driver metapackage
    un nvidia-driver-binary (no description available)
    un nvidia-egl-wayland-common (no description available)
    un nvidia-kernel-common (no description available)
    ii nvidia-kernel-common-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-kernel-common-520
    ii nvidia-kernel-common-520 520.56.06-0lambda0.22.04.3 amd64 Shared files used with the kernel module
    un nvidia-kernel-source (no description available)
    ii nvidia-kernel-source-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-kernel-source-520
    ii nvidia-kernel-source-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA kernel source package
    un nvidia-libopencl1-dev (no description available)
    un nvidia-opencl-icd (no description available)
    un nvidia-persistenced (no description available)
    ii nvidia-prime 0.8.17.1 all Tools to enable NVIDIA's Prime
    ii nvidia-settings 510.47.03-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
    un nvidia-settings-binary (no description available)
    un nvidia-smi (no description available)
    un nvidia-utils (no description available)
    ii nvidia-utils-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for nvidia-utils-520
    ii nvidia-utils-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA driver support binaries
    ii xserver-xorg-video-nvidia-515 520.56.06-0lambda0.22.04.3 amd64 Transitional package for xserver-xorg-video-nvidia-520
    ii xserver-xorg-video-nvidia-520 520.56.06-0lambda0.22.04.3 amd64 NVIDIA binary Xorg driver

  • [x] NVIDIA container library version from nvidia-container-cli -V

  • cli-version: 1.11.0
    lib-version: 1.11.0
    build date: 2022-10-25T22:10+00:00
    build revision:
    build compiler: x86_64-linux-gnu-gcc-11 11.3.0
    build platform: x86_64
    build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -g -O2 -ffile-prefix-map=/build/libnvidia-container-956QFy/libnvidia-container-1.11.0+dfsg=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro

@elezar
Member

elezar commented Dec 2, 2022

The toolkit explicitly looks for libnvidia-ml.so.1, which should be a symlink to libnvidia-ml.so.<DRIVER_VERSION> after running ldconfig on your host. Since nvidia-smi works (and also uses libnvidia-ml.so.1), I would not expect that link to be missing.
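A quick way to check this on the host, as a sketch (the paths assume a Debian/Ubuntu multiarch layout and a 520-series driver; adjust for your system):

$ ldconfig -p | grep libnvidia-ml
$ ls -l /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
# expected: libnvidia-ml.so.1 -> libnvidia-ml.so.520.56.06 (or whatever your driver version is)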

How is docker installed, could it be that it is installed as a snap and cannot load the system libraries because of this?

@Dan-Burns
Author

Dan-Burns commented Dec 2, 2022

I installed docker-desktop after following the "docker engine" link on https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow

@johndpope

johndpope commented Dec 5, 2022

Same problem on Ubuntu 22.04.

Linux msi 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

docker desktop

can you unpack this?

The toolkit explicitly looks for libnvidia-ml.so.1, which should be a symlink to libnvidia-ml.so.<DRIVER_VERSION> after running ldconfig on your host. Since nvidia-smi works (and also uses libnvidia-ml.so.1), I would not expect that link to be missing.

How is docker installed, could it be that it is installed as a snap and cannot load the system libraries because of this?

I installed

sudo apt-get install -y nvidia-docker2

successfully
nvidia-docker2 is already the newest version (2.11.0-1).

Mon Dec  5 18:59:03 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   57C    P8    29W / 370W |   1010MiB / 24576MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1515      G   /usr/lib/xorg/Xorg                548MiB |
|    0   N/A  N/A      1649      G   /usr/bin/gnome-shell              234MiB |
|    0   N/A  N/A     19695      G   ...RendererForSitePerProcess       32MiB |
|    0   N/A  N/A     19769    C+G   ...192290595412440874,131072      191MiB |
+-----------------------------------------------------------------------------+




nvidia-container-cli -k -d /dev/tty info

-- WARNING, the following logs are for debugging purposes only --

I1205 08:00:00.132727 24945 nvc.c:376] initializing library context (version=1.11.0, build=c8f267be0bac1c654d59ad4ea5df907141149977)
I1205 08:00:00.132797 24945 nvc.c:350] using root /
I1205 08:00:00.132806 24945 nvc.c:351] using ldcache /etc/ld.so.cache
I1205 08:00:00.132819 24945 nvc.c:352] using unprivileged user 29999:29999
I1205 08:00:00.132844 24945 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I1205 08:00:00.133009 24945 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W1205 08:00:00.134346 24946 nvc.c:273] failed to set inheritable capabilities
W1205 08:00:00.134424 24946 nvc.c:274] skipping kernel modules load due to failure
I1205 08:00:00.134891 24947 rpc.c:71] starting driver rpc service
I1205 08:00:00.142782 24948 rpc.c:71] starting nvcgo rpc service
I1205 08:00:00.143811 24945 nvc_info.c:766] requesting driver information with ''
I1205 08:00:00.145644 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.525.60.11
I1205 08:00:00.145731 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.525.60.11
I1205 08:00:00.145778 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.525.60.11
I1205 08:00:00.145821 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.525.60.11
I1205 08:00:00.145877 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.525.60.11
I1205 08:00:00.145930 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.525.60.11
I1205 08:00:00.145970 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.525.60.11
I1205 08:00:00.146007 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.525.60.11
I1205 08:00:00.146066 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.525.60.11
I1205 08:00:00.146105 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.525.60.11
I1205 08:00:00.146144 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.525.60.11
I1205 08:00:00.146183 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.525.60.11
I1205 08:00:00.146236 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.525.60.11
I1205 08:00:00.146288 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.525.60.11
I1205 08:00:00.146325 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.525.60.11
I1205 08:00:00.146366 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.525.60.11
I1205 08:00:00.146418 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.525.60.11
I1205 08:00:00.146475 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.525.60.11
I1205 08:00:00.146752 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcudadebugger.so.525.60.11
I1205 08:00:00.146788 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.525.60.11
I1205 08:00:00.146943 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.525.60.11
I1205 08:00:00.146977 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.525.60.11
I1205 08:00:00.147011 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.525.60.11
I1205 08:00:00.147046 24945 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.525.60.11
I1205 08:00:00.147106 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.525.60.11
I1205 08:00:00.147140 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.525.60.11
I1205 08:00:00.147186 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.525.60.11
I1205 08:00:00.147236 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.525.60.11
I1205 08:00:00.147271 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.525.60.11
I1205 08:00:00.147319 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.525.60.11
I1205 08:00:00.147350 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.525.60.11
I1205 08:00:00.147385 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.525.60.11
I1205 08:00:00.147417 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.525.60.11
I1205 08:00:00.147465 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.525.60.11
I1205 08:00:00.147515 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.525.60.11
I1205 08:00:00.147547 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.525.60.11
I1205 08:00:00.147582 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.525.60.11
I1205 08:00:00.147649 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.525.60.11
I1205 08:00:00.147707 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.525.60.11
I1205 08:00:00.147741 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.525.60.11
I1205 08:00:00.147775 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.525.60.11
I1205 08:00:00.147811 24945 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.525.60.11
W1205 08:00:00.147830 24945 nvc_info.c:399] missing library libnvidia-nscq.so
W1205 08:00:00.147836 24945 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W1205 08:00:00.147842 24945 nvc_info.c:399] missing library libnvidia-pkcs11.so
W1205 08:00:00.147847 24945 nvc_info.c:399] missing library libvdpau_nvidia.so
W1205 08:00:00.147854 24945 nvc_info.c:399] missing library libnvidia-ifr.so
W1205 08:00:00.147859 24945 nvc_info.c:399] missing library libnvidia-cbl.so
W1205 08:00:00.147867 24945 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W1205 08:00:00.147873 24945 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W1205 08:00:00.147878 24945 nvc_info.c:403] missing compat32 library libcudadebugger.so
W1205 08:00:00.147887 24945 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W1205 08:00:00.147893 24945 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W1205 08:00:00.147899 24945 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W1205 08:00:00.147904 24945 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W1205 08:00:00.147910 24945 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W1205 08:00:00.147916 24945 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W1205 08:00:00.147921 24945 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W1205 08:00:00.147926 24945 nvc_info.c:403] missing compat32 library libnvoptix.so
W1205 08:00:00.147932 24945 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I1205 08:00:00.148532 24945 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I1205 08:00:00.148551 24945 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
I1205 08:00:00.148569 24945 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
I1205 08:00:00.148598 24945 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
I1205 08:00:00.148615 24945 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
W1205 08:00:00.148707 24945 nvc_info.c:425] missing binary nv-fabricmanager
W1205 08:00:00.148735 24945 nvc_info.c:349] missing firmware path /lib/firmware/nvidia/525.60.11/gsp.bin
I1205 08:00:00.148762 24945 nvc_info.c:529] listing device /dev/nvidiactl
I1205 08:00:00.148767 24945 nvc_info.c:529] listing device /dev/nvidia-uvm
I1205 08:00:00.148775 24945 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I1205 08:00:00.148781 24945 nvc_info.c:529] listing device /dev/nvidia-modeset
I1205 08:00:00.148809 24945 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
W1205 08:00:00.148831 24945 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W1205 08:00:00.148847 24945 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I1205 08:00:00.148851 24945 nvc_info.c:822] requesting device information with ''
I1205 08:00:00.155221 24945 nvc_info.c:713] listing device /dev/nvidia0 (GPU-94c5d11e-e574-eefc-2db6-08e204f9e1a4 at 00000000:01:00.0)
NVRM version:   525.60.11
CUDA version:   12.0

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce RTX 3090
Brand:          GeForce
GPU UUID:       GPU-94c5d11e-e574-eefc-2db6-08e204f9e1a4
Bus Location:   00000000:01:00.0
Architecture:   8.6
I1205 08:00:00.155235 24945 nvc.c:434] shutting down library context
I1205 08:00:00.155296 24948 rpc.c:95] terminating nvcgo rpc service
I1205 08:00:00.155542 24945 rpc.c:135] nvcgo rpc service terminated successfully
I1205 08:00:00.156623 24947 rpc.c:95] terminating driver rpc service
I1205 08:00:00.156671 24945 rpc.c:135] driver rpc service terminated successfully

@johndpope

johndpope commented Dec 5, 2022

Screenshot from 2022-12-05 22-35-34
Not sure if it helps - I had originally installed the driver from the CUDA 11.8 package, but when I did the nvidia-docker2 install the driver broke, so I reverted back to the system (auto-installed) driver.

UPDATE

UPDATE: reading through the docs at
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

this command works fine:


sudo docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
Unable to find image 'nvidia/cuda:11.8.0-base-ubuntu22.04' locally
11.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
301a8b74f71f: Already exists 
35985d37d899: Already exists 
5b7513e7876e: Already exists 
bbf319bc026c: Already exists 
da5c9c5d5ac3: Already exists 
Digest: sha256:83493b3f150cc23f91fb0d2509e491204e33f062355d401662389a80a9091b82
Status: Downloaded newer image for nvidia/cuda:11.8.0-base-ubuntu22.04
Mon Dec  5 23:05:46 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   44C    P8    25W / 370W |    995MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+


OK -

so it's basically only a problem when running without sudo...

docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi 
Unable to find image 'nvidia/cuda:11.8.0-base-ubuntu22.04' locally
11.8.0-base-ubuntu22.04: Pulling from nvidia/cuda
301a8b74f71f: Already exists 
35985d37d899: Already exists 
5b7513e7876e: Already exists 
bbf319bc026c: Already exists 
da5c9c5d5ac3: Already exists 
Digest: sha256:83493b3f150cc23f91fb0d2509e491204e33f062355d401662389a80a9091b82
Status: Downloaded newer image for nvidia/cuda:11.8.0-base-ubuntu22.04
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

UPDATE - FIXED.
I don't know if this helps, but on my installation I had
cudnn-local-repo-ubuntu2204-8.6.0.163_1.0-1_amd64.deb + CUDA 11.8
which is incorrect. I was using cog, and it didn't surface the error - it just assumed everything was working correctly.
Updating to the latest cuDNN resolved my original issue:
cudnn-local-repo-ubuntu2204-8.7.0.84_1.0-1_amd64.deb
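For anyone checking the same thing, a hedged way to list which cuDNN packages are installed on a Debian/Ubuntu host:

$ dpkg -l | grep -i cudnn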

@groucho64738

groucho64738 commented Dec 12, 2022

I'm having a similar issue on a system I'm using for K8s: no containers that require the NVIDIA drivers can run, and they fail with the same error (about libnvidia-ml.so.1). I'm not sure which specific step broke it for me, though. I was able to reproduce the error message on the command line by running the CUDA container directly on our node: docker run --gpus=all --runtime=nvidia nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi

I created a debug log for nvidia-container-toolkit:

I1212 19:31:12.254613 192312 nvc.c:376] initializing library context (version=1.10.0, build=395fd41701117121f1fd04ada01e1d7e006a37ae)
I1212 19:31:12.254655 192312 nvc.c:350] using root /run/nvidia/driver
I1212 19:31:12.254660 192312 nvc.c:351] using ldcache /etc/ld.so.cache
I1212 19:31:12.254665 192312 nvc.c:352] using unprivileged user 65534:65534
I1212 19:31:12.254683 192312 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I1212 19:31:12.254778 192312 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
I1212 19:31:12.267767 192321 nvc.c:278] loading kernel module nvidia
E1212 19:31:12.267872 192321 nvc.c:280] could not load kernel module nvidia
I1212 19:31:12.267883 192321 nvc.c:296] loading kernel module nvidia_uvm
E1212 19:31:12.267904 192321 nvc.c:298] could not load kernel module nvidia_uvm
I1212 19:31:12.267914 192321 nvc.c:305] loading kernel module nvidia_modeset
E1212 19:31:12.267934 192321 nvc.c:307] could not load kernel module nvidia_modeset
I1212 19:31:12.268200 192322 rpc.c:71] starting driver rpc service
I1212 19:31:12.268825 192312 rpc.c:135] driver rpc service terminated with signal 15
I1212 19:31:12.268870 192312 nvc.c:434] shutting down library context

Not a lot of help there. If I run nvidia-container-cli -k -d /dev/tty info I get a list of all of the modules and libraries, so that part functions. I've also tried running the container in privileged mode and still get the same error. Each time I'm root when trying to kick off the container, to eliminate that piece as well.

This is an Ubuntu 20.04 system running Docker 20.10.18. I've followed the directions in the install guide (pretty straightforward to follow).

If there are any suggestions of what else to try to debug, I'm willing to give them a try. This has been a real headache.

@groucho64738

I actually managed to fix this. At some point we had uncommented the option root = "/run/nvidia/driver" in /etc/nvidia-container-runtime/config.toml (we must have seen directions on this somewhere). My best guess is that we updated something on the system that made this no longer a viable option, and after a reboot everything stopped working. I commented that option back out and everything popped up.

To find it, I created a wrapper around nvidia-container-cli:

#!/bin/bash

echo "$@" > /var/tmp/debuginfo
/usr/bin/nvidia-container-cli.real "$@"
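
For reference, one way such a wrapper could be put in place - a sketch that assumes the real binary is moved aside first and that /usr/bin/nvidia-container-cli is the path the runtime hook invokes:

$ sudo mv /usr/bin/nvidia-container-cli /usr/bin/nvidia-container-cli.real
$ sudo tee /usr/bin/nvidia-container-cli > /dev/null <<'EOF'
#!/bin/bash
# log the arguments the hook passes, then hand off to the real CLI
echo "$@" > /var/tmp/debuginfo
exec /usr/bin/nvidia-container-cli.real "$@"
EOF
$ sudo chmod +x /usr/bin/nvidia-container-cli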

That showed me the options that were being passed on a working and on a non-working system.

Not working:

--root=/run/nvidia/driver --load-kmods --debug=/var/log/nvidia-container-toolkit.log configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=11.8 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=unknown,driver>=510,driver<511 brand=nvidia,driver>=510,driver<511 brand=nvidiartx,driver>=510,driver<511 brand=geforce,driver>=510,driver<511 brand=geforcertx,driver>=510,driver<511 brand=quadro,driver>=510,driver<511 brand=quadrortx,driver>=510,driver<511 brand=titan,driver>=510,driver<511 brand=titanrtx,driver>=510,driver<511 brand=unknown,driver>=515,driver<516 brand=nvidia,driver>=515,driver<516 brand=nvidiartx,driver>=515,driver<516 brand=geforce,driver>=515,driver<516 brand=geforcertx,driver>=515,driver<516 brand=quadro,driver>=515,driver<516 brand=quadrortx,driver>=515,driver<516 brand=titan,driver>=515,driver<516 brand=titanrtx,driver>=515,driver<516 --pid=3895576 /var/lib/docker/overlay2/47f7deb4479aa6b8c26f3b6e3ad4a2cd9bd86304736bf9aed68ed4127fbc0d00/merged

Working:

--load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=11.8 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=unknown,driver>=510,driver<511 brand=nvidia,driver>=510,driver<511 brand=nvidiartx,driver>=510,driver<511 brand=geforce,driver>=510,driver<511 brand=geforcertx,driver>=510,driver<511 brand=quadro,driver>=510,driver<511 brand=quadrortx,driver>=510,driver<511 brand=titan,driver>=510,driver<511 brand=titanrtx,driver>=510,driver<511 brand=unknown,driver>=515,driver<516 brand=nvidia,driver>=515,driver<516 brand=nvidiartx,driver>=515,driver<516 brand=geforce,driver>=515,driver<516 brand=geforcertx,driver>=515,driver<516 brand=quadro,driver>=515,driver<516 brand=quadrortx,driver>=515,driver<516 brand=titan,driver>=515,driver<516 brand=titanrtx,driver>=515,driver<516 --pid=2830327 /var/lib/docker/overlay2/59206c16f5a12eadbe2e42287a7ff6aa3559b0666048d7578b29df90e3755d50/merged

@johndpope

From the look of it, the first one is driver 470 and the second is 511. It does seem like everything can be working fine and then Ubuntu automatically changes the driver (leaving a broken system). I recommend using Timeshift to create a snapshot whenever everything is working (after a new driver / CUDA update, etc.) - https://github.com/linuxmint/timeshift - it's trivial to roll back to a working snapshot and you won't lose any personal files.

@ThatCooperLewis

ThatCooperLewis commented Dec 14, 2022

Trying to build containers on Arch here. I installed Docker through docker-desktop originally, but I've also installed nvidia-docker, cuda, cuda-tools, cudnn, and nvidia-container-toolkit on the host machine in an attempt to resolve this.

The only workaround I've found so far is to run docker as root. That resolves this specific issue, but of course I'd rather not be forced to run all my docker commands via sudo (and Docker Desktop fails to recognize those containers/images).

Some relevant outputs from host machine:

$ ldconfig -p | grep cuda         
        
        libicudata.so.72 (libc6,x86-64) => /usr/lib/libicudata.so.72
        libicudata.so.72 (ELF) => /usr/lib32/libicudata.so.72
        libicudata.so (libc6,x86-64) => /usr/lib/libicudata.so
        libicudata.so (ELF) => /usr/lib32/libicudata.so
        libcuda.so.1 (libc6,x86-64) => /usr/lib/libcuda.so.1
        libcuda.so.1 (libc6) => /usr/lib32/libcuda.so.1
        libcuda.so (libc6,x86-64) => /usr/lib/libcuda.so
        libcuda.so (libc6) => /usr/lib32/libcuda.so
$ nvidia-smi
Wed Dec 14 15:20:49 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:0C:00.0  On |                  N/A |
|  0%   32C    P8    30W / 370W |    937MiB / 10240MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
$ uname -a
6.0.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 08 Dec 2022 11:03:38 +0000 x86_64 GNU/Linux

@ThatCooperLewis

ThatCooperLewis commented Dec 15, 2022

I changed distros and still had a very similar issue with Docker Desktop + nvidia-docker together. But adding this workaround to the nvidia runtime config seemed to fix things for me. [UPDATE: It does not]

$ vi /etc/nvidia-container-runtime/config.toml

no-cgroups = true

Unsure of whether this was the cause in my old distro (EndeavorOS), but I will try to confirm later.

@shiwakant

All instructions were helpful, but I had to start docker, docker build, and docker run with root privileges to make it work.
Even after repeated attempts, I was unable to run it with user-level permissions.

@pochoi

pochoi commented Jan 8, 2023

All instructions were helpful, but I had to start docker, docker build, and docker run with root privileges to make it work. Even after repeated attempts, I was unable to run it with user-level permissions.

I have the same error (nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1) when running docker without sudo.

Is there a way to get this working without sudo?

@ThatCooperLewis

@shiwakant @pochoi You can get this working by avoiding Docker Desktop and instead setting up Docker Rootless Mode
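For anyone going that route, a hedged sketch of the rootless setup (it assumes the docker-ce-rootless-extras package is installed, that the runtime config lives at /etc/nvidia-container-runtime/config.toml with the stock commented "#no-cgroups = false" line, and that your UID owns the user session; see the official rootless docs for the authoritative steps):

$ dockerd-rootless-setuptool.sh install
$ export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock
# rootless Docker cannot manage device cgroups, so the NVIDIA runtime needs cgroups disabled:
$ sudo sed -i 's/^#no-cgroups = false/no-cgroups = true/' /etc/nvidia-container-runtime/config.toml
$ docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi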

@lbadi

lbadi commented Jan 19, 2023

I know this might be a dumb answer, but I was having the same issue and it got fixed after I logged in with docker login ghcr.io -u *** --password-stdin.

@turboazot

turboazot commented Mar 24, 2023

From my side I used:

sudo ldconfig

That worked for me. But if you are using a Docker image with dind and the nvidia-docker integration in it, execute this in the entrypoint script, otherwise it may not work.
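As an illustration of that, a minimal entrypoint sketch for a dind image with the NVIDIA runtime baked in (the layout is an assumption, not taken from the original setup):

#!/bin/sh
# refresh the linker cache so libnvidia-ml.so.1 can be resolved
# before any GPU container is started
ldconfig
exec "$@"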

@justmiles

I ran into this as well and was simply missing the nvidia-driver-<version> and nvidia-dkms-<version> packages. It would be worth double-checking that the actual NVIDIA drivers are installed.
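A hedged way to do that check on Debian/Ubuntu (the driver version below is only an example):

$ dpkg -l 'nvidia-driver-*' 'nvidia-dkms-*'
# install them if they are missing, e.g.:
$ sudo apt-get install nvidia-driver-525 nvidia-dkms-525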

@pfcouto

pfcouto commented Apr 23, 2023

Hello, can I bring up this topic again?

1. Issue or feature description

Upon running the command docker run --privileged --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi I get the error

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

2. Steps to reproduce the issue

I installed the NVIDIA driver following the Fedora docs, not NVIDIA's, so, for example, nvcc --version outputs an error saying that the nvcc command is not recognized, but on my host machine I can run nvidia-smi.

The commands I used to install nvidia are the following:

sudo dnf install akmod-nvidia
sudo dnf install xorg-x11-drv-nvidia-cuda

And as visible in the following image, I am able to run the command nvidia-smi on my host machine.

image

I followed this guide on how to install nvidia-docker - - and did the following:

curl -s -L https://nvidia.github.io/libnvidia-container/centos8/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
##############################
sudo dnf install nvidia-docker2
# Edit /etc/nvidia-container-runtime/config.toml and disable cgroups:
no-cgroups = true

sudo reboot
##############################
sudo systemctl start docker.service
##############################
docker run --privileged --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

and upon running this docker command I get the error shown in section 1.

The thing is, I have the file that it says it is missing (check the following image), so maybe it is looking for it in a different directory?

image

3. Information to attach (optional if deemed irrelevant)

uname -a:

Linux fedora 6.2.10-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr  6 23:30:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

docker version

Client: Docker Engine - Community
 Cloud integration: v1.0.31
 Version:           23.0.3
 API version:       1.41 (downgraded from 1.42)
 Go version:        go1.19.7
 Git commit:        3e7cbfd
 Built:             Tue Apr  4 22:10:33 2023
 OS/Arch:           linux/amd64
 Context:           desktop-linux

Server: Docker Desktop 4.18.0 (104112)
 Engine:
  Version:          20.10.24
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.19.7
  Git commit:       5d6db84
  Built:            Tue Apr  4 18:18:42 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

rpm -qa '*nvidia*'

 nvidia-gpu-firmware-20230310-148.fc37.noarch
xorg-x11-drv-nvidia-kmodsrc-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-cuda-libs-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-libs-530.41.03-1.fc37.x86_64
nvidia-settings-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-power-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-530.41.03-1.fc37.x86_64
akmod-nvidia-530.41.03-1.fc37.x86_64
kmod-nvidia-6.2.9-200.fc37.x86_64-530.41.03-1.fc37.x86_64
nvidia-persistenced-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-cuda-530.41.03-1.fc37.x86_64
xorg-x11-drv-nvidia-libs-530.41.03-1.fc37.i686
xorg-x11-drv-nvidia-cuda-libs-530.41.03-1.fc37.i686
kmod-nvidia-6.2.10-200.fc37.x86_64-530.41.03-1.fc37.x86_64
nvidia-container-toolkit-base-1.13.0-1.x86_64
libnvidia-container1-1.13.0-1.x86_64
libnvidia-container-tools-1.13.0-1.x86_64
nvidia-container-toolkit-1.13.0-1.x86_64
nvidia-docker2-2.13.0-1.noarch

nvidia-container-cli -V

cli-version: 1.13.0
lib-version: 1.13.0
build date: 2023-03-31T13:12+00:00
build revision: 20823911e978a50b33823a5783f92b6e345b241a
build compiler: gcc 8.5.0 20210514 (Red Hat 8.5.0-18)
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

Thanks for your help!

@elezar
Member

elezar commented Apr 24, 2023

@pfcouto and others that show this behaviour. Please enable debug logging for the nvidia-container-cli in the /etc/nvidia-container-toolkit/config.toml by uncommenting the #debug = line in that section.

Running a container should then generate a log at /var/log/nvidia-container-toolkit.log which may help to further debug this.
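For illustration, after uncommenting, the relevant section should look roughly like this (a sketch; on some installations the file is at /etc/nvidia-container-runtime/config.toml instead, as noted elsewhere in this thread):

[nvidia-container-cli]
debug = "/var/log/nvidia-container-toolkit.log"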

Note that the NVIDIA Container CLI needs to load libnvidia-ml.so.1 to retrieve the required information about the GPUs in the system. We have seen this behaviour when Docker Desktop is used, for example, since the hook is then executed in a VM that does not have access to the libraries and devices on the host. How is docker installed in this case?

Note, if you're able to install a recent version of podman, this could be an alternative as a CDI specification could be generated instead of relying on the nvidia-container-cli-based injection.
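A hedged sketch of that CDI route (it assumes a recent NVIDIA Container Toolkit that ships the nvidia-ctk binary and a CDI-capable podman):

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
$ podman run --rm --device nvidia.com/gpu=all ubuntu nvidia-smi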

@pfcouto
Copy link

pfcouto commented Apr 24, 2023

Hello @elezar, I will do what you said. Thanks! I think one of the issues is that, since I installed the NVIDIA drivers through RPM Fusion, the file is not in the default location: nvidia-docker is looking for it in one place, while I have it in another. How can I change the Docker image so that it can access my file in that different location?

As shown in the picture, when I installed the NVIDIA drivers through RPM it also installed a Flatpak, but the file is present. The thing is, the file is not where nvidia-docker expects it to be.

Can I create my own Dockerfile, like:

FROM nvidia/docker
COPY (my lib file location) (Docker image location)

Or should I just change the default location on my local machine to the correct one, just to test whether it works?

image

@elezar
Copy link
Member

elezar commented Apr 24, 2023

@pfcouto if the drivers are at a different location than expected, you could look at setting the root option in the config.toml. We use this setting when running the toolkit with our driver container. In that case we install the driver (and device nodes) to /run/nvidia/driver and root = /run/nvidia/driver is specified in the config.
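
As a rough sketch, the relevant setting would be (the /run/nvidia/driver path is only an example for a driver-container installation; point it at whatever prefix your libnvidia-ml.so.1 actually lives under):

[nvidia-container-cli]
root = "/run/nvidia/driver"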

@pfcouto
Copy link

pfcouto commented Apr 24, 2023

Hello @elezar. I do not have the folder /etc/nvidia-container-toolkit/ that you mentioned. However, I do have a folder nvidia-container-runtime which has a config.toml file, as shown in the picture. Is it OK for me to change in this file what you said to change in the other? Thanks!

image

@pfcouto
Copy link

pfcouto commented Apr 24, 2023

I changed the file as shown in the first picture: I uncommented the debug line and changed root to a directory where I have the libnvidia-ml.so.1 file (I don't know if I should have changed this, but I did). I then ran the command docker run --privileged --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi and it output the same error:

Unable to find image 'nvidia/cuda:11.0.3-base-ubuntu20.04' locally
11.0.3-base-ubuntu20.04: Pulling from nvidia/cuda
d7bfe07ed847: Pull complete 
75eccf561042: Pull complete 
191419884744: Pull complete 
a17a942db7e1: Pull complete 
16156c70987f: Pull complete 
Digest: sha256:57455121f3393b7ed9e5a0bc2b046f57ee7187ea9ec562a7d17bf8c97174040d
Status: Downloaded newer image for nvidia/cuda:11.0.3-base-ubuntu20.04
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
ERRO[0003] error waiting for container:

Then I went ahead and tried to look into the log file but it was not created...
cat /var/log/nvidia-container-toolkit.log:

cat: /var/log/nvidia-container-toolkit.log: No such file or directory

image

@pfcouto
Copy link

pfcouto commented Apr 27, 2023

Hi again @elezar, also, I don't have the folder /run/nvidia/driver

image

@lishoulong
Copy link

lishoulong commented May 2, 2023

Maybe just don't install CUDA and the NVIDIA container packages?

@elezar
Copy link
Member

elezar commented May 2, 2023

Hi again @elezar, also, I don't have the folder /run/nvidia/driver

Sorry for the lack of clarity. I was using /run/nvidia/driver as an example of a path we use when installing the driver using our driver container. The NVIDIA Container Toolkit considers the root setting when looking for libnvidia-ml.so.1 (the standard lib paths are prepended), and if your installation has these libraries at a non-standard location this setting will help to locate them.

Since your output in one of your comments does show /usr/lib64/libnvidia-ml.so.1, could you confirm where this symlink points? (Your output also shows a Flatpak location.)

Could you link the Fedora docs you used to install the driver?
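
For example, the symlink target and the ldcache entries can be checked with:

readlink -f /usr/lib64/libnvidia-ml.so.1
ldconfig -p | grep libnvidia-ml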

@AskAlice
Copy link

AskAlice commented Sep 1, 2023

I have this issue unless I run as root. I'm using docker-desktop.

❯ stat /usr/lib/libnvidia-ml.*
  File: /usr/lib/libnvidia-ml.so -> libnvidia-ml.so.1
  Size: 17              Blocks: 8          IO Block: 4096   symbolic link
Device: 0,26    Inode: 1753370     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-09-01 13:46:31.971672402 -0600
Modify: 2023-08-22 11:37:15.000000000 -0600
Change: 2023-08-26 22:24:35.978217529 -0600
 Birth: 2023-08-26 22:24:35.978217529 -0600
  File: /usr/lib/libnvidia-ml.so.1 -> libnvidia-ml.so.535.104.05
  Size: 26              Blocks: 8          IO Block: 4096   symbolic link
Device: 0,26    Inode: 1753371     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-09-01 13:46:31.971672402 -0600
Modify: 2023-08-22 11:37:15.000000000 -0600
Change: 2023-08-26 22:24:35.978217529 -0600
 Birth: 2023-08-26 22:24:35.978217529 -0600
  File: /usr/lib/libnvidia-ml.so.535.104.05
  Size: 1815872         Blocks: 3552       IO Block: 4096   regular file
Device: 0,26    Inode: 1753372     Links: 1
Access: (0777/-rwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-09-01 14:25:25.495551694 -0600
Modify: 2023-08-22 11:37:15.000000000 -0600
Change: 2023-09-01 14:24:06.427728737 -0600
 Birth: 2023-08-26 22:24:35.978217529 -0600

It also seems to be reproducible with the PKGBUILD I created here: https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/17#note_1530784413

here is my config.toml

disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = false
#user = "root:video"
ldconfig = "/sbin/ldconfig"

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"

# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
    "docker-runc",
    "runc",
]

mode = "auto"

    [nvidia-container-runtime.modes.csv]

    mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"

@bkocis
Copy link

bkocis commented Sep 28, 2023

I had the same issue. For me, a reinstall of Docker fixed it.

I ran this as a bash script:

sudo apt-get update

sudo apt install apt-transport-https ca-certificates curl software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"

apt-cache policy docker-ce

sudo apt install docker-ce

@RobQuistNL
Copy link

RobQuistNL commented Oct 25, 2023

Looks like this just doesn't work with Docker Desktop.

When you run the script that @bkocis shared, you're installing docker-ce, most likely alongside Docker Desktop. So the sudo version of docker runs the CE engine, while the regular one uses your Docker Desktop engine.

At least, this is what happens for me :)

$ docker run --privileged --gpus all nvidia/cuda:12.2.2-runtime-ubuntu22.04 nvidia-smi
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
ERRO[0000] error waiting for container:                 
$ sudo docker run --privileged --gpus all nvidia/cuda:12.2.2-runtime-ubuntu22.04 nvidia-smi

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Wed Oct 25 18:34:56 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  Off |
|  0%   51C    P0    72W / 450W |   1493MiB / 24564MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Before installing docker-ce, you'd get this error:

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.
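
A quick way to confirm which engine each invocation talks to is to compare the contexts (the default context points at the local docker-ce socket, while Docker Desktop uses its own):

docker context ls
sudo docker context ls
# optionally switch the non-root client back to the local engine:
docker context use default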

@JosephKuchar
Copy link

Hi all,

I'm having this same issue. It's perplexing because everything was working as of a few weeks ago, but it seems that since we rebooted the machine, Docker's ability to access the GPU has somehow broken. In my case, running docker with sudo does not make a difference.

sudo docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

The output of nvidia-smi is the following:

Wed Oct 25 16:30:24 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro RTX 5000                 On | 00000000:B3:00.0 Off |                  Off |
| 33%   28C    P8               13W / 230W|     71MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2463      G   /usr/lib/xorg/Xorg                           63MiB |
|    0   N/A  N/A      2957      G   /usr/bin/gnome-shell                          5MiB |
+---------------------------------------------------------------------------------------+

I also edited /etc/nvidia-container-toolkit/config.toml by uncommenting the #debug = line in that section. The output suggests it's not able to find the NVIDIA devices:

I1026 12:51:01.432712 1726142 nvc.c:376] initializing library context (version=1.12.0, build=7678e1af094d865441d0bc1b97>
I1026 12:51:01.432857 1726142 nvc.c:350] using root /
I1026 12:51:01.432876 1726142 nvc.c:351] using ldcache /etc/ld.so.cache
I1026 12:51:01.432891 1726142 nvc.c:352] using unprivileged user 65534:65534
I1026 12:51:01.432931 1726142 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for>
I1026 12:51:01.433368 1726142 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W1026 12:51:01.436341 1726142 nvc.c:258] failed to detect NVIDIA devices
I1026 12:51:01.436817 1726149 nvc.c:278] loading kernel module nvidia
I1026 12:51:01.437436 1726149 nvc.c:282] running mknod for /dev/nvidiactl
I1026 12:51:01.437525 1726149 nvc.c:290] running mknod for all nvcaps in /dev/nvidia-caps
I1026 12:51:01.453626 1726149 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap1 from /proc/driver/nvidia/capabi>
I1026 12:51:01.453787 1726149 nvc.c:218] running mknod for /dev/nvidia-caps/nvidia-cap2 from /proc/driver/nvidia/capabi>
I1026 12:51:01.456764 1726149 nvc.c:296] loading kernel module nvidia_uvm
I1026 12:51:01.456957 1726149 nvc.c:300] running mknod for /dev/nvidia-uvm
I1026 12:51:01.457046 1726149 nvc.c:305] loading kernel module nvidia_modeset
I1026 12:51:01.457288 1726149 nvc.c:309] running mknod for /dev/nvidia-modeset
I1026 12:51:01.458019 1726150 rpc.c:71] starting driver rpc service
I1026 12:51:01.459088 1726142 rpc.c:132] driver rpc service terminated with signal 15
I1026 12:51:01.459205 1726142 nvc.c:434] shutting down library context

As I said, this was working a few weeks ago, so I'm not sure what's changed. We haven't updated any drivers or anything of that nature that I'm aware of. Any help appreciated!

@bkocis
Copy link

bkocis commented Oct 30, 2023

@JosephKuchar try reinstalling Docker. I had a similar problem, the issue being a missing runtime (see docker info).
The solution for me was to reinstall Docker: NVIDIA/nvidia-docker#1648 (comment)
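
To see whether the nvidia runtime is actually registered with the daemon, something like this works:

docker info --format '{{json .Runtimes}}'
# or simply:
docker info | grep -i -A3 runtimes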

@destefy
Copy link

destefy commented Nov 8, 2023

Thanks @bkocis! Worked like a charm!

@elezar elezar transferred this issue from NVIDIA/nvidia-docker Nov 19, 2023
@archenroot
Copy link

archenroot commented Dec 7, 2023

Hi guys,

I hit the same issue on Ubuntu 22.04 LTS. I followed the instructions below to reinstall (note that I had also initially installed docker-desktop):

sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl restart docker

And I was able to run the container from the comment above:
docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
and I was finally able to run RAPIDS:
docker run --gpus all --pull always --rm -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/notebooks:23.12a-cuda11.2-py3.10

At the moment docker-desktop is uninstalled. I will try to install it again and run the tests.

@archenroot
Copy link

I additionally installed docker-desktop again and rebooted, and the GPU in containers still works.

@goldwater668
Copy link

@johndpope
I installed docker-desktop on Windows 10. The graphics card driver is 546.01 and the following error is reported. How should I solve it?
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

@archenroot
Copy link

archenroot commented Dec 8, 2023

@goldwater668 It seems that reinstalling the Docker engine helps on Linux machines. As far as I understand, the root cause may be somewhere in docker-desktop, but I didn't find it myself.

@goldwater668
Copy link

@elezar nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

@johndpope
Copy link

@goldwater668 - I see you're on Windows; try running things as administrator.

Starting Docker image cog-cog-svd-base and running setup()...
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
ⅹ Failed to start container: exit status 127

Running with sudo fixed this for me.

@lihkinVerma
Copy link

lihkinVerma commented Dec 18, 2023

Hi guys,

I hit same issue on Ubuntu 22.04 LTS, I followed instructions to reinstall as bellow (note I have installed as well docker-desktop initially)

sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl restart docker

And I was able to run this machine from above comment: docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark and I was finally capable to run RAPIDS: docker run --gpus all --pull always --rm -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/notebooks:23.12a-cuda11.2-py3.10

At moment the docker-desktop is uninstalled. I will try to install it again and run the tests.

This worked for me. Thankyou so much

@combofish
Copy link

Hi guys,
I hit same issue on Ubuntu 22.04 LTS, I followed instructions to reinstall as bellow (note I have installed as well docker-desktop initially)

sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl restart docker

And I was able to run this machine from above comment: docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark and I was finally capable to run RAPIDS: docker run --gpus all --pull always --rm -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/notebooks:23.12a-cuda11.2-py3.10
At moment the docker-desktop is uninstalled. I will try to install it again and run the tests.

This worked for me. Thankyou so much

@huangpan2507
Copy link

huangpan2507 commented Jan 4, 2024

Hi guys,

I hit same issue on Ubuntu 22.04 LTS, I followed instructions to reinstall as bellow (note I have installed as well docker-desktop initially)

sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-ce-rootless-extras
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl restart docker

And I was able to run this machine from above comment: docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark and I was finally capable to run RAPIDS: docker run --gpus all --pull always --rm -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/notebooks:23.12a-cuda11.2-py3.10

At moment the docker-desktop is uninstalled. I will try to install it again and run the tests.

Thanks, I had the same issue and this solution helped me! But please note that all your Docker images will be deleted because of sudo rm -rf /var/lib/docker!

@zebin-huang
Copy link

I had the same issue. For me a reinstall of docker fixed the issue:

I run as a bash script:

sudo apt-get update

sudo apt install apt-transport-https ca-certificates curl software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"

apt-cache policy docker-ce

sudo apt install docker-ce

This works for me!

@sjmach
Copy link

sjmach commented Feb 8, 2024

I had the same issue. For me a reinstall of docker fixed the issue:

I run as a bash script:

sudo apt-get update

sudo apt install apt-transport-https ca-certificates curl software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"

apt-cache policy docker-ce

sudo apt install docker-ce

This works for me too. Please substitute the appropriate Ubuntu release in the fourth line; run lsb_release -a in your terminal first to get the codename string.
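
For example, the codename can be substituted directly instead of hard-coding focal (assuming an Ubuntu host):

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"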

@clh15683
Copy link

If you encounter this with Docker Desktop, make sure that you enable the WSL integration for your distribution under Settings -> Resources -> WSL Integration. It seems that Docker Desktop occasionally forgets this setting on updates.

@GabrielDornelles
Copy link

I followed this conversation last night, turned off the PC, and everything was good. Today I went back to work and it wasn't working anymore, with the same error about not finding libnvidia-ml.so.1.

I don't really know how to solve this; it's a packaging issue, as pointed out by others. What I had to do to make it work again was:

sudo snap remove --purge docker

That removes the snap Docker packages (the previous long shell command is what worked before, but now it doesn't).

Then reinstalling everything again following the Docker instructions:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

@gerardo8a
Copy link

gerardo8a commented Mar 6, 2024

I had the issue with the NVIDIA library as well. After looking at one of my working nodes, the /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml was different: the following two values were set differently, and that was the root of the error.

working

....
[nvidia-container-cli]
  environment = []
  ldconfig = "@/sbin/ldconfig.real"
  load-kmods = true
  path = "/usr/local/nvidia/toolkit/nvidia-container-cli"
  root = "/"
...

Not working

...
[nvidia-container-cli]
  environment = []
  ldconfig = "@/run/nvidia/driver/sbin/ldconfig.real"
  load-kmods = true
  path = "/usr/local/nvidia/toolkit/nvidia-container-cli"
  root = "/run/nvidia/driver"
...
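
A quick sanity check for which of the two configurations applies on a given node is whether the containerized driver root actually exists, e.g.:

ls /run/nvidia/driver 2>/dev/null || echo "no containerized driver here; root = \"/\" is probably what you want"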

@elezar
Copy link
Member

elezar commented Mar 7, 2024

@gerardo8a how was your NVIDIA Container Toolkit and the NVIDIA driver installed? Your non-working config seems to reference a containerized driver installation (usually under the GPU Operator), whereas your working config references a driver installation on the host (note the root and ldconfig values).

@iganev
Copy link

iganev commented Apr 30, 2024

I followed this conversation last night, turned off the pc and good. Today I went back to work, and it wasnt working anymore, same error occuring of not finding libnvidia-ml.so.1.

I don't really know how to solve this, its package issues as pointed by others. What I had to do to make it work again was

sudo snap remove --purge docker

Removing the docker stuff (the previous long shell command is what worked before, but now it doesnt).

Then re installing everything again from docker instructions:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

As Gabe describes here, if reinstalling Docker (apt reinstall docker-ce) fixes your issue only temporarily, make sure you don't have another Docker installed through snap (snap list).
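
A quick check for a second Docker installation, for example:

snap list | grep -i docker
which -a docker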

@zhangxianwei2015
Copy link

In my case, configuring the container runtime for Docker running in [Rootless mode](https://docs.docker.com/engine/security/rootless/) worked for me.
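
Roughly, the rootless setup looks like the following (a sketch based on the toolkit's rootless-mode instructions; verify against the current docs):

nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
systemctl --user restart docker
sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place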

aelovikov-intel pushed a commit to intel/llvm that referenced this issue Jun 14, 2024
Updating the docker to 12.5 led to the below described problems. Since
the testing output of 12.1 matches 12.5, and we don't actually use any
cuda features later than 12.1 (which are minor updates) in the compiler,
this PR reverts back to the 12.1 image.
We can update the docker later only when we really need to (probably
when cuda 13 is released). For the purposes of intel/llvm CI 12.1 is
sufficient.
This fixes the "latest" docker image, allowing other updates to the
docker image to be made in the future.

CUDA docker issues:

Depending on the host setup of the runners, there are various issues on
recent nvidia docker images related to interactions with the host,
whereby nvidia devices are not visible.

- NVIDIA/nvidia-container-toolkit#48
- NVIDIA/nvidia-docker#1671
- NVIDIA/nvidia-container-toolkit#154

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>