Docker Desktop and Nvidia Runtime inoperable across multiple distros #229

Open
9 tasks done
ThatCooperLewis opened this issue Dec 17, 2022 · 9 comments

ThatCooperLewis commented Dec 17, 2022

1. Issue or feature description

On fresh Arch (EndeavourOS) and Ubuntu (20.04 and 22.04) installations, any attempt to use the NVIDIA runtime through Docker Desktop fails with this error, regardless of the image:

$ docker run --rm --gpus all nvidia/cuda:12.0.0-devel-ubuntu22.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
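
The last line of the error says the dynamic loader could not open libnvidia-ml.so.1. As a rough way to check whether that library is loadable in a given environment (for example on the host versus wherever the Docker Desktop engine runs), here is a minimal sketch. The helper names `can_load` and `report` are hypothetical, and `ctypes.CDLL` only approximates the lookup that nvidia-container-cli performs, so treat the result as a hint rather than a diagnosis.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library cannot be found or cannot be loaded.
        return False
    return True


def report(soname: str = "libnvidia-ml.so.1") -> str:
    """Build a one-line, human-readable status for the library."""
    status = "loadable" if can_load(soname) else "NOT loadable"
    return f"{soname}: {status}"
```

Running `print(report())` in each environment would show whether the library named in the error is visible there.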

2. Steps to reproduce the issue

  1. Install the prerequisites for Docker on Linux
  2. Install Docker Desktop
  3. Complete the NVIDIA CUDA pre-installation steps and the CUDA installation as mentioned in that doc
  4. Complete the CUDA post-installation steps
  5. Install the NVIDIA Container Toolkit (NCT)
  6. Ensure the runtime is installed and configured, in particular by running sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime
  7. Fix the runtime config.toml as mentioned here

At this point, two Docker contexts are installed.

$ docker context ls -q
default
desktop-linux

GPU-related images run successfully only after docker context use default; under the desktop-linux context they fail with the error above.
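
To compare the two contexts without switching the active one back and forth, the same test command can be run against each context explicitly. The following is a minimal sketch that assumes the Docker CLI's global `--context` flag and reuses the image and command from this report. The helper names `build_command` and `run_in_context` are hypothetical.

```python
import subprocess

# The GPU smoke test from this report, minus the leading "docker".
GPU_TEST = [
    "run", "--rm", "--gpus", "all",
    "nvidia/cuda:12.0.0-devel-ubuntu22.04", "nvidia-smi",
]


def build_command(context: str) -> list[str]:
    """Build the docker invocation for one named context."""
    return ["docker", "--context", context, *GPU_TEST]


def run_in_context(context: str) -> tuple[int, str]:
    """Run the smoke test in the given context; return (exit code, stderr)."""
    proc = subprocess.run(
        build_command(context), capture_output=True, text=True
    )
    return proc.returncode, proc.stderr.strip()
```

Calling `run_in_context("default")` and then `run_in_context("desktop-linux")` should give one exit code and error text per context, which makes the difference between the two easy to paste into a report.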

3. Information to attach (optional if deemed irrelevant)

  • Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
I1217 06:20:48.926631 18367 nvc.c:376] initializing library context (version=1.11.0, build=c8f267be0bac1c654d59ad4ea5df907141149977)
I1217 06:20:48.926671 18367 nvc.c:350] using root /
I1217 06:20:48.926677 18367 nvc.c:351] using ldcache /etc/ld.so.cache
I1217 06:20:48.926685 18367 nvc.c:352] using unprivileged user 1000:1000
I1217 06:20:48.926705 18367 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I1217 06:20:48.926800 18367 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W1217 06:20:48.928354 18368 nvc.c:273] failed to set inheritable capabilities
W1217 06:20:48.928390 18368 nvc.c:274] skipping kernel modules load due to failure
I1217 06:20:48.928629 18369 rpc.c:71] starting driver rpc service
I1217 06:20:48.935890 18370 rpc.c:71] starting nvcgo rpc service
I1217 06:20:48.936581 18367 nvc_info.c:766] requesting driver information with ''
I1217 06:20:48.937599 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.525.60.13
I1217 06:20:48.937658 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.525.60.13
I1217 06:20:48.937690 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.525.60.13
I1217 06:20:48.937722 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.525.60.13
I1217 06:20:48.937767 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.525.60.13
I1217 06:20:48.937808 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.525.60.13
I1217 06:20:48.937839 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.525.60.13
I1217 06:20:48.937874 18367 nvc_info.c:175] skipping /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
I1217 06:20:48.937914 18367 nvc_info.c:175] skipping /usr/lib/x86_64-linux-gnu/libnvidia-ml.so
I1217 06:20:48.937944 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.525.60.13
I1217 06:20:48.937973 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.525.60.13
I1217 06:20:48.938001 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.525.60.13
I1217 06:20:48.938028 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.525.60.13
I1217 06:20:48.938070 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.525.60.13
I1217 06:20:48.938109 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.525.60.13
I1217 06:20:48.938138 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.525.60.13
I1217 06:20:48.938168 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.525.60.13
I1217 06:20:48.938208 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.525.60.13
I1217 06:20:48.938248 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.525.60.13
I1217 06:20:48.938488 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcudadebugger.so.525.60.13
I1217 06:20:48.938517 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.525.60.13
I1217 06:20:48.938633 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.525.60.13
I1217 06:20:48.938662 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.525.60.13
I1217 06:20:48.938691 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.525.60.13
I1217 06:20:48.938720 18367 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.525.60.13
I1217 06:20:48.938786 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.525.60.13
I1217 06:20:48.938814 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.525.60.13
I1217 06:20:48.938856 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.525.60.13
I1217 06:20:48.938897 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.525.60.13
I1217 06:20:48.938925 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.525.60.13
I1217 06:20:48.938965 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.525.60.13
I1217 06:20:48.938992 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.525.60.13
I1217 06:20:48.939019 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.525.60.13
I1217 06:20:48.939046 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.525.60.13
I1217 06:20:48.939085 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.525.60.13
I1217 06:20:48.939123 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.525.60.13
I1217 06:20:48.939158 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.525.60.13
I1217 06:20:48.939186 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.525.60.13
I1217 06:20:48.939237 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.525.60.13
I1217 06:20:48.939285 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.525.60.13
I1217 06:20:48.939314 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.525.60.13
I1217 06:20:48.939341 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.525.60.13
I1217 06:20:48.939367 18367 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.525.60.13
W1217 06:20:48.939384 18367 nvc_info.c:399] missing library libnvidia-ml.so
W1217 06:20:48.939388 18367 nvc_info.c:399] missing library libnvidia-nscq.so
W1217 06:20:48.939393 18367 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W1217 06:20:48.939398 18367 nvc_info.c:399] missing library libnvidia-pkcs11.so
W1217 06:20:48.939402 18367 nvc_info.c:399] missing library libvdpau_nvidia.so
W1217 06:20:48.939409 18367 nvc_info.c:399] missing library libnvidia-ifr.so
W1217 06:20:48.939416 18367 nvc_info.c:399] missing library libnvidia-cbl.so
W1217 06:20:48.939421 18367 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W1217 06:20:48.939428 18367 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W1217 06:20:48.939435 18367 nvc_info.c:403] missing compat32 library libcudadebugger.so
W1217 06:20:48.939444 18367 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W1217 06:20:48.939449 18367 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W1217 06:20:48.939454 18367 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W1217 06:20:48.939458 18367 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W1217 06:20:48.939466 18367 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W1217 06:20:48.939473 18367 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W1217 06:20:48.939481 18367 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W1217 06:20:48.939487 18367 nvc_info.c:403] missing compat32 library libnvoptix.so
W1217 06:20:48.939492 18367 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I1217 06:20:48.939771 18367 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I1217 06:20:48.939787 18367 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
I1217 06:20:48.939803 18367 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
I1217 06:20:48.939827 18367 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
I1217 06:20:48.939842 18367 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
W1217 06:20:48.939902 18367 nvc_info.c:425] missing binary nv-fabricmanager
W1217 06:20:48.939925 18367 nvc_info.c:349] missing firmware path /lib/firmware/nvidia/525.60.13/gsp.bin
I1217 06:20:48.939947 18367 nvc_info.c:529] listing device /dev/nvidiactl
I1217 06:20:48.939951 18367 nvc_info.c:529] listing device /dev/nvidia-uvm
I1217 06:20:48.939956 18367 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I1217 06:20:48.939960 18367 nvc_info.c:529] listing device /dev/nvidia-modeset
I1217 06:20:48.939983 18367 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
W1217 06:20:48.940009 18367 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W1217 06:20:48.940023 18367 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I1217 06:20:48.940028 18367 nvc_info.c:822] requesting device information with ''
I1217 06:20:48.945723 18367 nvc_info.c:713] listing device /dev/nvidia0 (GPU-024a9805-2239-4784-b51f-6857a5f87d21 at 00000000:0c:00.0)
NVRM version:   525.60.13
CUDA version:   12.0

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce RTX 3080
Brand:          GeForce
GPU UUID:       GPU-024a9805-2239-4784-b51f-6857a5f87d21
Bus Location:   00000000:0c:00.0
Architecture:   8.6
I1217 06:20:48.945765 18367 nvc.c:434] shutting down library context
I1217 06:20:48.945785 18370 rpc.c:95] terminating nvcgo rpc service
I1217 06:20:48.946151 18367 rpc.c:135] nvcgo rpc service terminated successfully
I1217 06:20:48.947952 18369 rpc.c:95] terminating driver rpc service
I1217 06:20:48.948093 18367 rpc.c:135] driver rpc service terminated successfully
  • Kernel version from uname -a
    5.15.0-56-generic #62~20.04.1-Ubuntu SMP Tue Nov 22 21:24:20 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Any relevant kernel output lines from dmesg
    [drm:nv_drm_master_set [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000c00] Failed to grab modeset ownership
  • Driver information from nvidia-smi -a
==============NVSMI LOG==============

Timestamp                                 : Fri Dec 16 22:33:01 2022
Driver Version                            : 525.60.13
CUDA Version                              : 12.0

Attached GPUs                             : 1
GPU 00000000:0C:00.0
    Product Name                          : NVIDIA GeForce RTX 3080
    Product Brand                         : GeForce
    Product Architecture                  : Ampere
    Display Mode                          : Enabled
    Display Active                        : Enabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-024a9805-2239-4784-b51f-6857a5f87d21
    Minor Number                          : 0
    VBIOS Version                         : 94.02.26.48.1F
    MultiGPU Board                        : No
    Board ID                              : 0xc00
    Board Part Number                     : N/A
    GPU Part Number                       : 2206-200-A1
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.03.03
        OEM Object                        : 2.0
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x0C
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x220610DE
        Bus Id                            : 00000000:0C:00.0
        Sub System Id                     : 0x87AC1043
        GPU Link Info
            PCIe Generation
                Max                       : 4
                Current                   : 1
                Device Current            : 1
                Device Max                : 4
                Host Max                  : 4
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 10000 KB/s
        Atomic Caps Inbound               : N/A
        Atomic Caps Outbound              : N/A
    Fan Speed                             : 0 %
    Performance State                     : P8
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 10240 MiB
        Reserved                          : 233 MiB
        Used                              : 623 MiB
        Free                              : 9383 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 8 MiB
        Free                              : 248 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 1 %
        Memory                            : 46 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 30 C
        GPU Shutdown Temp                 : 98 C
        GPU Slowdown Temp                 : 95 C
        GPU Max Operating Temp            : 93 C
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 30.15 W
        Power Limit                       : 370.00 W
        Default Power Limit               : 370.00 W
        Enforced Power Limit              : 370.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 450.00 W
    Clocks
        Graphics                          : 210 MHz
        SM                                : 210 MHz
        Memory                            : 405 MHz
        Video                             : 555 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2130 MHz
        SM                                : 2130 MHz
        Memory                            : 9501 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : 743.750 mV
    Fabric
        State                             : N/A
        Status                            : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 2241
            Type                          : G
            Name                          : /usr/lib/xorg/Xorg
            Used GPU Memory               : 276 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 2386
            Type                          : G
            Name                          : /usr/bin/gnome-shell
            Used GPU Memory               : 134 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 4074
            Type                          : G
            Name                          : /usr/lib/firefox/firefox
            Used GPU Memory               : 190 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 5421
            Type                          : G
            Name                          : /opt/docker-desktop/Docker Desktop --type=gpu-process --enable-crashpad --enable-crash-reporter=d062e744-c844-4ecd-9553-0ff5cc70f399,no_channel --user-data-dir=/home/cooper/.config/Docker Desktop --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,14049627346808659855,13096785095356484666,131072 --disable-features=SpareRendererForSitePerProcess
            Used GPU Memory               : 18 MiB
  • Docker version from docker version
Client: Docker Engine - Community
 Cloud integration: v1.0.29
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:28:08 2022
 OS/Arch:           linux/amd64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.15.0 (93002)
 Engine:
  Version:          20.10.21
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.7
  Git commit:       3056208
  Built:            Tue Oct 25 18:00:19 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.10
  GitCommit:        770bd0108c32f3fb5c73ae1264f7e503fe7b2661
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
un  libgldispatch0-nvidia                      <none>                 <none>       (no description available)
ii  libnvidia-cfg1-525:amd64                   525.60.13-0ubuntu1     amd64        NVIDIA binary OpenGL/GLX configuration library
un  libnvidia-cfg1-any                         <none>                 <none>       (no description available)
un  libnvidia-common                           <none>                 <none>       (no description available)
ii  libnvidia-common-525                       525.60.13-0ubuntu1     all          Shared files used by the NVIDIA libraries
ii  libnvidia-compute-525:amd64                525.60.13-0ubuntu1     amd64        NVIDIA libcompute package
ii  libnvidia-compute-525:i386                 525.60.13-0ubuntu1     i386         NVIDIA libcompute package
ii  libnvidia-container-tools                  1.11.0-1               amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64                 1.11.0-1               amd64        NVIDIA container runtime library
un  libnvidia-decode                           <none>                 <none>       (no description available)
ii  libnvidia-decode-525:amd64                 525.60.13-0ubuntu1     amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-525:i386                  525.60.13-0ubuntu1     i386         NVIDIA Video Decoding runtime libraries
un  libnvidia-encode                           <none>                 <none>       (no description available)
ii  libnvidia-encode-525:amd64                 525.60.13-0ubuntu1     amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-525:i386                  525.60.13-0ubuntu1     i386         NVENC Video Encoding runtime library
un  libnvidia-extra                            <none>                 <none>       (no description available)
ii  libnvidia-extra-525:amd64                  525.60.13-0ubuntu1     amd64        Extra libraries for the NVIDIA driver
un  libnvidia-fbc1                             <none>                 <none>       (no description available)
ii  libnvidia-fbc1-525:amd64                   525.60.13-0ubuntu1     amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-525:i386                    525.60.13-0ubuntu1     i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
un  libnvidia-gl                               <none>                 <none>       (no description available)
ii  libnvidia-gl-525:amd64                     525.60.13-0ubuntu1     amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-525:i386                      525.60.13-0ubuntu1     i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un  libnvidia-ml1                              <none>                 <none>       (no description available)
rc  linux-modules-nvidia-525-5.15.0-56-generic 5.15.0-56.62~20.04.1+1 amd64        Linux kernel nvidia modules for version 5.15.0-56
ii  linux-objects-nvidia-525-5.15.0-56-generic 5.15.0-56.62~20.04.1+1 amd64        Linux kernel nvidia modules for version 5.15.0-56 (objects)
ii  linux-signatures-nvidia-5.15.0-56-generic  5.15.0-56.62~20.04.1+1 amd64        Linux kernel signatures for nvidia modules for version 5.15.0-56-generic
un  nvidia-384                                 <none>                 <none>       (no description available)
un  nvidia-390                                 <none>                 <none>       (no description available)
un  nvidia-common                              <none>                 <none>       (no description available)
ii  nvidia-compute-utils-525                   525.60.13-0ubuntu1     amd64        NVIDIA compute utilities
un  nvidia-container-runtime                   <none>                 <none>       (no description available)
un  nvidia-container-runtime-hook              <none>                 <none>       (no description available)
ii  nvidia-container-toolkit                   1.11.0-1               amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base              1.11.0-1               amd64        NVIDIA Container Toolkit Base
ii  nvidia-dkms-525                            525.60.13-0ubuntu1     amd64        NVIDIA DKMS package
un  nvidia-dkms-kernel                         <none>                 <none>       (no description available)
un  nvidia-docker                              <none>                 <none>       (no description available)
ii  nvidia-docker2                             2.11.0-1               all          nvidia-docker CLI wrapper
ii  nvidia-driver-525                          525.60.13-0ubuntu1     amd64        NVIDIA driver metapackage
un  nvidia-driver-binary                       <none>                 <none>       (no description available)
un  nvidia-kernel-common                       <none>                 <none>       (no description available)
ii  nvidia-kernel-common-525                   525.60.13-0ubuntu1     amd64        Shared files used with the kernel module
un  nvidia-kernel-open                         <none>                 <none>       (no description available)
un  nvidia-kernel-open-525                     <none>                 <none>       (no description available)
un  nvidia-kernel-source                       <none>                 <none>       (no description available)
ii  nvidia-kernel-source-525                   525.60.13-0ubuntu1     amd64        NVIDIA kernel source package
un  nvidia-legacy-304xx-vdpau-driver           <none>                 <none>       (no description available)
un  nvidia-legacy-340xx-vdpau-driver           <none>                 <none>       (no description available)
un  nvidia-libopencl1-dev                      <none>                 <none>       (no description available)
ii  nvidia-modprobe                            525.60.13-0ubuntu1     amd64        Load the NVIDIA kernel driver and create device files
un  nvidia-opencl-icd                          <none>                 <none>       (no description available)
un  nvidia-persistenced                        <none>                 <none>       (no description available)
ii  nvidia-prime                               0.8.16~0.20.04.2       all          Tools to enable NVIDIA's Prime
ii  nvidia-settings                            525.60.13-0ubuntu1     amd64        Tool for configuring the NVIDIA graphics driver
un  nvidia-settings-binary                     <none>                 <none>       (no description available)
un  nvidia-smi                                 <none>                 <none>       (no description available)
un  nvidia-utils                               <none>                 <none>       (no description available)
ii  nvidia-utils-525                           525.60.13-0ubuntu1     amd64        NVIDIA driver support binaries
un  nvidia-vdpau-driver                        <none>                 <none>       (no description available)
ii  xserver-xorg-video-nvidia-525              525.60.13-0ubuntu1     amd64        NVIDIA binary Xorg driver
  • NVIDIA container library version from nvidia-container-cli -V
cli-version: 1.11.0
lib-version: 1.11.0
build date: 2022-09-06T09:21+00:00
build revision: c8f267be0bac1c654d59ad4ea5df907141149977
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
  • NVIDIA container library logs (see troubleshooting)
    No logs produced.
  • Docker command, image and tag used
    docker run --rm --gpus all nvidia/cuda:12.0.0-devel-ubuntu22.04 nvidia-smi
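
The nvidia-container-cli log above is long, and the "missing library" warnings are easy to overlook among the "selecting" lines. As a minimal sketch, assuming the log format shown in this report, the warnings could be pulled out as follows. The function name `missing_libraries` and the "native"/"compat32" grouping keys are hypothetical choices for illustration.

```python
import re

# Matches warning lines such as:
#   W1217 06:20:48.939384 18367 nvc_info.c:399] missing library libnvidia-ml.so
#   W1217 06:20:48.939421 18367 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
MISSING = re.compile(
    r"^W\d{4} \S+ \d+ \S+\] missing (compat32 )?library (\S+)$"
)


def missing_libraries(log_text: str) -> dict[str, list[str]]:
    """Group the libraries reported missing by native vs. compat32."""
    found: dict[str, list[str]] = {"native": [], "compat32": []}
    for line in log_text.splitlines():
        match = MISSING.match(line.strip())
        if match:
            key = "compat32" if match.group(1) else "native"
            found[key].append(match.group(2))
    return found
```

Feeding it the captured output of nvidia-container-cli -k -d /dev/tty info gives a short list to compare between machines or driver versions.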
@AIWintermuteAI

Same issue.

@javiplav

Same issue

@mohamedleithy

same issue

dssp301 commented Jan 30, 2023

same issue

nicholasmullikin commented Jan 31, 2023

I had the same problem, followed the same debug steps, and saw an almost identical setup:

  • Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
-- WARNING, the following logs are for debugging purposes only --

I0131 19:03:41.187880 64600 nvc.c:376] initializing library context (version=1.11.0, build=c8f267be0bac1c654d59ad4ea5df907141149977)
I0131 19:03:41.187897 64600 nvc.c:350] using root /
I0131 19:03:41.187899 64600 nvc.c:351] using ldcache /etc/ld.so.cache
I0131 19:03:41.187902 64600 nvc.c:352] using unprivileged user 1000:1000
I0131 19:03:41.187910 64600 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0131 19:03:41.187956 64600 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W0131 19:03:41.448196 64601 nvc.c:273] failed to set inheritable capabilities
W0131 19:03:41.448215 64601 nvc.c:274] skipping kernel modules load due to failure
I0131 19:03:41.448469 64602 rpc.c:71] starting driver rpc service
I0131 19:03:41.452178 64604 rpc.c:71] starting nvcgo rpc service
I0131 19:03:41.452609 64600 nvc_info.c:766] requesting driver information with ''
I0131 19:03:41.453353 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.525.85.12
I0131 19:03:41.453383 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.525.85.12
I0131 19:03:41.453396 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.525.85.12
I0131 19:03:41.453409 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.525.85.12
I0131 19:03:41.453428 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.525.85.12
I0131 19:03:41.453452 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.525.85.12
I0131 19:03:41.453473 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.525.85.12
I0131 19:03:41.453488 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.525.85.12
I0131 19:03:41.453510 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.525.85.12
I0131 19:03:41.453523 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.525.85.12
I0131 19:03:41.453538 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.525.85.12
I0131 19:03:41.453555 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.525.85.12
I0131 19:03:41.453575 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.525.85.12
I0131 19:03:41.453593 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.525.85.12
I0131 19:03:41.453606 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.525.85.12
I0131 19:03:41.453617 64600 nvc_info.c:175] skipping /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.510.108.03
I0131 19:03:41.453631 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.525.85.12
I0131 19:03:41.453649 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.525.85.12
I0131 19:03:41.453668 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.525.85.12
I0131 19:03:41.453816 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcudadebugger.so.525.85.12
I0131 19:03:41.453828 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.525.85.12
I0131 19:03:41.453967 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.525.85.12
I0131 19:03:41.453981 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.525.85.12
I0131 19:03:41.453998 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.525.85.12
I0131 19:03:41.454015 64600 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.525.85.12
I0131 19:03:41.454045 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.525.85.12
I0131 19:03:41.454058 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.525.85.12
I0131 19:03:41.454076 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.525.85.12
I0131 19:03:41.454099 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.525.85.12
I0131 19:03:41.454119 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.525.85.12
I0131 19:03:41.454146 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.525.85.12
I0131 19:03:41.454162 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.525.85.12
I0131 19:03:41.454179 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.525.85.12
I0131 19:03:41.454197 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.525.85.12
I0131 19:03:41.454224 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.525.85.12
I0131 19:03:41.454250 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.525.85.12
I0131 19:03:41.454267 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.525.85.12
I0131 19:03:41.454280 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.525.85.12
I0131 19:03:41.454312 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libcuda.so.525.85.12
I0131 19:03:41.454347 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.525.85.12
I0131 19:03:41.454367 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.525.85.12
I0131 19:03:41.454385 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.525.85.12
I0131 19:03:41.454403 64600 nvc_info.c:173] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.525.85.12
W0131 19:03:41.454415 64600 nvc_info.c:399] missing library libnvidia-nscq.so
W0131 19:03:41.454419 64600 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W0131 19:03:41.454422 64600 nvc_info.c:399] missing library libnvidia-pkcs11.so
W0131 19:03:41.454425 64600 nvc_info.c:399] missing library libvdpau_nvidia.so
W0131 19:03:41.454428 64600 nvc_info.c:399] missing library libnvidia-ifr.so
W0131 19:03:41.454431 64600 nvc_info.c:399] missing library libnvidia-cbl.so
W0131 19:03:41.454433 64600 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W0131 19:03:41.454436 64600 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W0131 19:03:41.454439 64600 nvc_info.c:403] missing compat32 library libcudadebugger.so
W0131 19:03:41.454442 64600 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W0131 19:03:41.454445 64600 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W0131 19:03:41.454449 64600 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W0131 19:03:41.454451 64600 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W0131 19:03:41.454454 64600 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W0131 19:03:41.454457 64600 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W0131 19:03:41.454460 64600 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W0131 19:03:41.454462 64600 nvc_info.c:403] missing compat32 library libnvoptix.so
W0131 19:03:41.454465 64600 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I0131 19:03:41.454913 64600 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I0131 19:03:41.454923 64600 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
I0131 19:03:41.454932 64600 nvc_info.c:299] selecting /usr/bin/nvidia-persistenced
I0131 19:03:41.454943 64600 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-control
I0131 19:03:41.454949 64600 nvc_info.c:299] selecting /usr/bin/nvidia-cuda-mps-server
W0131 19:03:41.454992 64600 nvc_info.c:425] missing binary nv-fabricmanager
W0131 19:03:41.455003 64600 nvc_info.c:349] missing firmware path /lib/firmware/nvidia/525.85.12/gsp.bin
I0131 19:03:41.455014 64600 nvc_info.c:529] listing device /dev/nvidiactl
I0131 19:03:41.455016 64600 nvc_info.c:529] listing device /dev/nvidia-uvm
I0131 19:03:41.455019 64600 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I0131 19:03:41.455021 64600 nvc_info.c:529] listing device /dev/nvidia-modeset
I0131 19:03:41.455032 64600 nvc_info.c:343] listing ipc path /run/nvidia-persistenced/socket
W0131 19:03:41.455041 64600 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W0131 19:03:41.455047 64600 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I0131 19:03:41.455050 64600 nvc_info.c:822] requesting device information with ''
I0131 19:03:41.460806 64600 nvc_info.c:713] listing device /dev/nvidia0 (GPU-a0ccb68c-9ccc-019a-ab1a-c2a9fa0beaf7 at 00000000:01:00.0)
NVRM version:   525.85.12
CUDA version:   12.0

Device Index:   0
Device Minor:   0
Model:          NVIDIA GeForce RTX 3050 Ti Laptop GPU
Brand:          GeForce
GPU UUID:       GPU-a0ccb68c-9ccc-019a-ab1a-c2a9fa0beaf7
Bus Location:   00000000:01:00.0
Architecture:   8.6
I0131 19:03:41.460827 64600 nvc.c:434] shutting down library context
I0131 19:03:41.460846 64604 rpc.c:95] terminating nvcgo rpc service
I0131 19:03:41.461127 64600 rpc.c:135] nvcgo rpc service terminated successfully
I0131 19:03:41.462303 64602 rpc.c:95] terminating driver rpc service
I0131 19:03:41.462393 64600 rpc.c:135] driver rpc service terminated successfully

  • Kernel version from uname -a
    Linux nick-XPS-15-9520 5.15.0-57-generic #63-Ubuntu SMP Thu Nov 24 13:43:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

  • Any relevant kernel output lines from dmesg

  • Driver information from nvidia-smi -a

==============NVSMI LOG==============

Timestamp                                 : Tue Jan 31 14:06:02 2023
Driver Version                            : 525.85.12
CUDA Version                              : 12.0

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : NVIDIA GeForce RTX 3050 Ti Laptop GPU
    Product Brand                         : GeForce
    Product Architecture                  : Ampere
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-a0ccb68c-9ccc-019a-ab1a-c2a9fa0beaf7
    Minor Number                          : 0
    VBIOS Version                         : 94.07.5B.00.8A
    MultiGPU Board                        : No
    Board ID                              : 0x100
    Board Part Number                     : N/A
    GPU Part Number                       : 25A0-775-A1
    Module ID                             : 1
    Inforom Version
        Image Version                     : G001.0000.03.03
        OEM Object                        : 2.0
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x25A010DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x0B191028
        GPU Link Info
            PCIe Generation
                Max                       : 4
                Current                   : 4
                Device Current            : 4
                Device Max                : 4
                Host Max                  : 4
            Link Width
                Max                       : 16x
                Current                   : 8x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 2790000 KB/s
        Rx Throughput                     : 1000 KB/s
        Atomic Caps Inbound               : N/A
        Atomic Caps Outbound              : N/A
    Fan Speed                             : N/A
    Performance State                     : P0
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 4096 MiB
        Reserved                          : 191 MiB
        Used                              : 1924 MiB
        Free                              : 1979 MiB
    BAR1 Memory Usage
        Total                             : 4096 MiB
        Used                              : 6 MiB
        Free                              : 4090 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 14 %
        Memory                            : 7 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 52 C
        GPU T.Limit Temp                  : N/A
        GPU Shutdown Temp                 : 100 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 75 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 9.82 W
        Power Limit                       : 4294967.50 W
        Default Power Limit               : 35.00 W
        Enforced Power Limit              : 40.00 W
        Min Power Limit                   : 1.00 W
        Max Power Limit                   : 45.00 W
    Clocks
        Graphics                          : 1177 MHz
        SM                                : 1177 MHz
        Memory                            : 5500 MHz
        Video                             : 1035 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2100 MHz
        SM                                : 2100 MHz
        Memory                            : 5501 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : 675.000 mV
    Fabric
        State                             : N/A
        Status                            : N/A
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 53746
            Type                          : G
            Name                          : /usr/lib/xorg/Xorg
            Used GPU Memory               : 1178 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 54112
            Type                          : G
            Name                          : /usr/bin/gnome-shell
            Used GPU Memory               : 180 MiB
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 56819
            Type                          : G
            Name                          : /opt/docker-desktop/Docker Desktop --type=gpu-process --enable-crashpad --enable-crash-reporter=0a27583e-c64d-4a8f-89d4-a99e6504ccb3,no_channel --user-data-dir=/home/nick/.config/Docker Desktop --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,5899541367699722517,14699788830379162664,131072 --disable-features=SpareRendererForSitePerProcess
            Used GPU Memory               : 1 MiB
  • Docker version from docker version
Client: Docker Engine - Community
 Cloud integration: v1.0.29
 Version:           20.10.23
 API version:       1.41
 Go version:        go1.18.10
 Git commit:        7155243
 Built:             Thu Jan 19 17:45:08 2023
 OS/Arch:           linux/amd64
 Context:           desktop-linux
 Experimental:      true

Server: Docker Desktop 4.16.2 (95914)
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:26:14 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                              Version                     Architecture Description
+++-=================================================-===========================-============-========================================================================
un  libgldispatch0-nvidia                             <none>                      <none>       (no description available)
ii  libnvidia-cfg1-525:amd64                          525.85.12-0ubuntu1          amd64        NVIDIA binary OpenGL/GLX configuration library
un  libnvidia-cfg1-any                                <none>                      <none>       (no description available)
un  libnvidia-common                                  <none>                      <none>       (no description available)
ii  libnvidia-common-525                              525.85.12-0ubuntu1          all          Shared files used by the NVIDIA libraries
un  libnvidia-compute                                 <none>                      <none>       (no description available)
rc  libnvidia-compute-510:amd64                       510.108.03-0ubuntu0.22.04.1 amd64        NVIDIA libcompute package
rc  libnvidia-compute-515:amd64                       515.86.01-0ubuntu0.22.04.1  amd64        NVIDIA libcompute package
rc  libnvidia-compute-515-server:amd64                515.65.01-0ubuntu0.22.04.1  amd64        NVIDIA libcompute package
rc  libnvidia-compute-520:amd64                       520.56.06-0ubuntu0.22.04.1  amd64        NVIDIA libcompute package
ii  libnvidia-compute-525:amd64                       525.85.12-0ubuntu1          amd64        NVIDIA libcompute package
ii  libnvidia-compute-525:i386                        525.85.12-0ubuntu1          i386         NVIDIA libcompute package
ii  libnvidia-container-tools                         1.11.0-1                    amd64        NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64                        1.11.0-1                    amd64        NVIDIA container runtime library
un  libnvidia-decode                                  <none>                      <none>       (no description available)
ii  libnvidia-decode-525:amd64                        525.85.12-0ubuntu1          amd64        NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-525:i386                         525.85.12-0ubuntu1          i386         NVIDIA Video Decoding runtime libraries
un  libnvidia-encode                                  <none>                      <none>       (no description available)
ii  libnvidia-encode-525:amd64                        525.85.12-0ubuntu1          amd64        NVENC Video Encoding runtime library
ii  libnvidia-encode-525:i386                         525.85.12-0ubuntu1          i386         NVENC Video Encoding runtime library
un  libnvidia-encode1                                 <none>                      <none>       (no description available)
un  libnvidia-extra                                   <none>                      <none>       (no description available)
ii  libnvidia-extra-525:amd64                         525.85.12-0ubuntu1          amd64        Extra libraries for the NVIDIA driver
un  libnvidia-fbc1                                    <none>                      <none>       (no description available)
ii  libnvidia-fbc1-525:amd64                          525.85.12-0ubuntu1          amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-525:i386                           525.85.12-0ubuntu1          i386         NVIDIA OpenGL-based Framebuffer Capture runtime library
un  libnvidia-gl                                      <none>                      <none>       (no description available)
ii  libnvidia-gl-525:amd64                            525.85.12-0ubuntu1          amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-525:i386                             525.85.12-0ubuntu1          i386         NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un  libnvidia-ml1                                     <none>                      <none>       (no description available)
rc  linux-modules-nvidia-515-5.15.0-47-generic        5.15.0-47.51                amd64        Linux kernel nvidia modules for version 5.15.0-47
rc  linux-modules-nvidia-515-5.15.0-48-generic        5.15.0-48.54                amd64        Linux kernel nvidia modules for version 5.15.0-48
rc  linux-modules-nvidia-515-5.15.0-50-generic        5.15.0-50.56                amd64        Linux kernel nvidia modules for version 5.15.0-50
rc  linux-modules-nvidia-515-5.15.0-52-generic        5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52
rc  linux-modules-nvidia-515-server-5.15.0-52-generic 5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52
rc  linux-modules-nvidia-515-server-5.15.0-53-generic 5.15.0-53.59+1              amd64        Linux kernel nvidia modules for version 5.15.0-53
rc  linux-modules-nvidia-520-5.15.0-52-generic        5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52
rc  linux-modules-nvidia-525-5.15.0-57-generic        5.15.0-57.63+1              amd64        Linux kernel nvidia modules for version 5.15.0-57
rc  linux-modules-nvidia-525-5.15.0-58-generic        5.15.0-58.64                amd64        Linux kernel nvidia modules for version 5.15.0-58
rc  linux-objects-nvidia-515-5.15.0-47-generic        5.15.0-47.51                amd64        Linux kernel nvidia modules for version 5.15.0-47 (objects)
rc  linux-objects-nvidia-515-5.15.0-48-generic        5.15.0-48.54                amd64        Linux kernel nvidia modules for version 5.15.0-48 (objects)
rc  linux-objects-nvidia-515-5.15.0-50-generic        5.15.0-50.56+1              amd64        Linux kernel nvidia modules for version 5.15.0-50 (objects)
rc  linux-objects-nvidia-515-5.15.0-52-generic        5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52 (objects)
rc  linux-objects-nvidia-515-server-5.15.0-52-generic 5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52 (objects)
rc  linux-objects-nvidia-515-server-5.15.0-53-generic 5.15.0-53.59+1              amd64        Linux kernel nvidia modules for version 5.15.0-53 (objects)
rc  linux-objects-nvidia-515-server-5.15.0-56-generic 5.15.0-56.62+1              amd64        Linux kernel nvidia modules for version 5.15.0-56 (objects)
rc  linux-objects-nvidia-520-5.15.0-52-generic        5.15.0-52.58+1              amd64        Linux kernel nvidia modules for version 5.15.0-52 (objects)
ii  linux-objects-nvidia-525-5.15.0-57-generic        5.15.0-57.63+1              amd64        Linux kernel nvidia modules for version 5.15.0-57 (objects)
ii  linux-objects-nvidia-525-5.15.0-58-generic        5.15.0-58.64+1              amd64        Linux kernel nvidia modules for version 5.15.0-58 (objects)
un  linux-signatures-nvidia-5.15.0-47-generic         <none>                      <none>       (no description available)
un  linux-signatures-nvidia-5.15.0-48-generic         <none>                      <none>       (no description available)
un  linux-signatures-nvidia-5.15.0-50-generic         <none>                      <none>       (no description available)
un  linux-signatures-nvidia-5.15.0-52-generic         <none>                      <none>       (no description available)
un  linux-signatures-nvidia-5.15.0-53-generic         <none>                      <none>       (no description available)
ii  linux-signatures-nvidia-5.15.0-57-generic         5.15.0-57.63+1              amd64        Linux kernel signatures for nvidia modules for version 5.15.0-57-generic
ii  linux-signatures-nvidia-5.15.0-58-generic         5.15.0-58.64+1              amd64        Linux kernel signatures for nvidia modules for version 5.15.0-58-generic
un  nvidia-384                                        <none>                      <none>       (no description available)
un  nvidia-390                                        <none>                      <none>       (no description available)
un  nvidia-common                                     <none>                      <none>       (no description available)
ii  nvidia-compute-utils-525                          525.85.12-0ubuntu1          amd64        NVIDIA compute utilities
ii  nvidia-container-runtime                          3.11.0-1                    all          NVIDIA container runtime
un  nvidia-container-runtime-hook                     <none>                      <none>       (no description available)
ii  nvidia-container-toolkit                          1.11.0-1                    amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base                     1.11.0-1                    amd64        NVIDIA Container Toolkit Base
un  nvidia-cuda-dev                                   <none>                      <none>       (no description available)
un  nvidia-cuda-doc                                   <none>                      <none>       (no description available)
un  nvidia-cuda-gdb                                   <none>                      <none>       (no description available)
rc  nvidia-cuda-toolkit                               11.5.1-1ubuntu1             amd64        NVIDIA CUDA development toolkit
un  nvidia-cuda-toolkit-doc                           <none>                      <none>       (no description available)
ii  nvidia-dkms-525                                   525.85.12-0ubuntu1          amd64        NVIDIA DKMS package
un  nvidia-dkms-kernel                                <none>                      <none>       (no description available)
un  nvidia-docker                                     <none>                      <none>       (no description available)
ii  nvidia-docker2                                    2.11.0-1                    all          nvidia-docker CLI wrapper
ii  nvidia-driver-525                                 525.85.12-0ubuntu1          amd64        NVIDIA driver metapackage
un  nvidia-driver-binary                              <none>                      <none>       (no description available)
un  nvidia-driver-libs                                <none>                      <none>       (no description available)
un  nvidia-kernel-common                              <none>                      <none>       (no description available)
un  nvidia-kernel-common-515                          <none>                      <none>       (no description available)
un  nvidia-kernel-common-515-server                   <none>                      <none>       (no description available)
un  nvidia-kernel-common-520                          <none>                      <none>       (no description available)
ii  nvidia-kernel-common-525                          525.85.12-0ubuntu1          amd64        Shared files used with the kernel module
un  nvidia-kernel-open                                <none>                      <none>       (no description available)
un  nvidia-kernel-open-525                            <none>                      <none>       (no description available)
un  nvidia-kernel-source                              <none>                      <none>       (no description available)
ii  nvidia-kernel-source-525                          525.85.12-0ubuntu1          amd64        NVIDIA kernel source package
un  nvidia-libopencl1-dev                             <none>                      <none>       (no description available)
ii  nvidia-modprobe                                   525.85.12-0ubuntu1          amd64        Load the NVIDIA kernel driver and create device files
un  nvidia-opencl-dev                                 <none>                      <none>       (no description available)
un  nvidia-opencl-icd                                 <none>                      <none>       (no description available)
un  nvidia-persistenced                               <none>                      <none>       (no description available)
ii  nvidia-prime                                      0.8.17.1                    all          Tools to enable NVIDIA's Prime
un  nvidia-profiler                                   <none>                      <none>       (no description available)
ii  nvidia-settings                                   525.85.12-0ubuntu1          amd64        Tool for configuring the NVIDIA graphics driver
un  nvidia-settings-binary                            <none>                      <none>       (no description available)
un  nvidia-smi                                        <none>                      <none>       (no description available)
un  nvidia-utils                                      <none>                      <none>       (no description available)
ii  nvidia-utils-525                                  525.85.12-0ubuntu1          amd64        NVIDIA driver support binaries
un  nvidia-visual-profiler                            <none>                      <none>       (no description available)
un  nvidia-vulkan-icd                                 <none>                      <none>       (no description available)
ii  xserver-xorg-video-nvidia-525                     525.85.12-0ubuntu1          amd64        NVIDIA binary Xorg driver
  • NVIDIA container library version from nvidia-container-cli -V
cli-version: 1.11.0
lib-version: 1.11.0
build date: 2022-09-06T09:21+00:00
build revision: c8f267be0bac1c654d59ad4ea5df907141149977
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
  • NVIDIA container library logs (see troubleshooting)
    None
  • Docker command, image and tag used
    docker run --rm --gpus all nvidia/cuda:12.0.0-devel-ubuntu22.04 nvidia-smi or
    docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown

This looks quite related to #154 as well as the linked docker forum thread
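
The debug log above is long, so a small parser can help compare it against the error message. The following is a minimal sketch, assuming the glog-style line format seen in this comment's `nvidia-container-cli` output; `summarize` and `host_has_library` are hypothetical helper names, not part of any NVIDIA tool. It groups the `selecting`, `skipping`, and `missing ... library` messages so one can check whether a library such as `libnvidia-ml.so` was found where the CLI was run.

```python
import re
from collections import defaultdict

# Matches log lines in the format seen above, e.g.
# "I0131 19:03:41.453488 64600 nvc_info.c:173] selecting /usr/lib/.../libnvidia-ml.so.525.85.12"
LINE_RE = re.compile(r"^[IWE]\d{4} \S+ \d+ \S+\] (?P<msg>.*)$")

# Only messages of the form "<kind> <single token>" are collected, so
# "skipping kernel modules load due to failure" is deliberately ignored.
MSG_RE = re.compile(
    r"^(?P<kind>selecting|skipping|missing library|missing compat32 library) "
    r"(?P<item>\S+)$"
)


def summarize(log_text):
    """Group the paths and names reported in the debug log by message kind."""
    summary = defaultdict(list)
    for line in log_text.splitlines():
        line_match = LINE_RE.match(line.strip())
        if not line_match:
            continue  # blank lines and the plain-text device info block
        msg_match = MSG_RE.match(line_match.group("msg"))
        if msg_match:
            summary[msg_match.group("kind")].append(msg_match.group("item"))
    return dict(summary)


def host_has_library(summary, name):
    """True if any selected path has a file name starting with `name`."""
    return any(
        path.rsplit("/", 1)[-1].startswith(name)
        for path in summary.get("selecting", [])
    )
```

Fed the log from this comment, `host_has_library(summary, "libnvidia-ml.so")` would be true, since the log lists `libnvidia-ml.so.525.85.12` as selected. That only shows the library was visible where the command was run. It says nothing about what the Docker Desktop environment can see.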

@mazispider

I encountered the same error running nvidia-docker. I set up Docker to run rootless and set no-cgroups=true in config.toml. Running nvidia-docker with sudo sometimes returns the nvidia-smi output properly, but sometimes fails with "OCI runtime create failed". However, every time I launch PyTorch (which uses the NVIDIA driver), with or without sudo, it returns a getDeviceCount() error.

@fidesachates

Would love to see this solved

@anupambhatnagar

same error

@elezar
Member

elezar commented May 31, 2023

Docker Desktop is not currently supported by the NVIDIA Container Stack.
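
Given that, it may help to check which daemon a shell is talking to before running a GPU container. The following is a minimal sketch, assuming the `docker version` text format posted earlier in this thread (a `Server: Docker Desktop ...` line and a client `Context:` line); the function names are hypothetical and the output format may differ between Docker releases.

```python
def server_is_docker_desktop(version_output):
    """Return True if the `docker version` text names Docker Desktop as the server."""
    for line in version_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Server:"):
            return "Docker Desktop" in stripped
    return False  # no Server line, e.g. the daemon was unreachable


def client_context(version_output):
    """Return the value of the client's 'Context:' line, or None if absent."""
    for line in version_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "Context":
            return value.strip()
    return None
```

Pass these functions the captured text of `docker version`. If the server is Docker Desktop, switching with `docker context use default`, as described in the original report, targets the native engine instead.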
