This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Cannot get nvidia-docker2 to run #1742

Closed
3 tasks
Overcraft90 opened this issue Mar 20, 2023 · 7 comments

Comments

@Overcraft90

1. Issue or feature description

After following the steps here, I am unable to run the NVIDIA Container Toolkit properly.

2. Steps to reproduce the issue

First, I got Docker Desktop running on Ubuntu 22.04; then I ran the following:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container//ubuntu22.04/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker

sudo systemctl restart docker
Everything was fine up to this point, but then when running
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
I got the following message:

docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.
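
One way to confirm what this error is saying is to ask the daemon which runtimes it has actually registered. The following is a minimal sketch, assuming a Docker release where `docker info --format '{{json .Runtimes}}'` is available; the helper name `has_nvidia_runtime` is hypothetical and not part of Docker or the NVIDIA Container Toolkit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the JSON on stdin has a runtime key named "nvidia".
has_nvidia_runtime() {
    grep -q '"nvidia"[[:space:]]*:'
}

# Only query the daemon when the docker CLI is present, so the snippet is safe to run anywhere.
if command -v docker >/dev/null 2>&1; then
    if docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered with this daemon"
    else
        echo "nvidia runtime is NOT registered with this daemon"
    fi
fi
```

If the runtime is missing even though the config file lists it, the daemon answering the CLI is probably not the one that reads that file.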

Any help is appreciated, as #838 does not seem to solve my issue.

3. Information to attach (optional if deemed irrelevant)

  • [nvidia-container-cli.log] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
  • [5.19.0-35-generic] Kernel version from uname -a
  • Any relevant kernel output lines from dmesg
  • [nvidia-smi.log] Driver information from nvidia-smi -a
  • [23.0.1] Docker version from docker version (although I also got a permission denied error while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied; Client: Docker Engine - Community)
  • [nvidia-packages.log] NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
  • [nvidia-container-library.log] NVIDIA container library version from nvidia-container-cli -V
  • NVIDIA container library logs (see troubleshooting)
  • Docker command, image and tag used
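
The permission denied error on /var/run/docker.sock mentioned in the list above often means the current user is not a member of the `docker` group, at least on a standard Docker Engine install. The sketch below checks for that; the helper name `has_group` is hypothetical, and the suggested `usermod` command is the usual post-install step rather than something taken from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if group $1 appears in the space-separated list $2.
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the names of all groups the current user belongs to.
if has_group docker "$(id -nG)"; then
    echo "current user is in the docker group"
else
    echo "current user is not in the docker group; consider:"
    echo "  sudo usermod -aG docker $(id -un)   # then log out and back in"
fi
```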
@bsamadi

bsamadi commented Mar 21, 2023

Same here

@elezar
Member

elezar commented Mar 21, 2023

@Overcraft90 @bsamadi could either of you provide the contents of your /etc/docker/daemon.json files?

Also, how was docker installed? If it was installed using snap, it may be using a different config file.

@Overcraft90
Author

Overcraft90 commented Mar 21, 2023

Sure, here is the content of mine

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

In my case, at least, it was installed following the procedure listed here.

@elezar
Member

elezar commented Mar 21, 2023

OK, so you're running Docker Desktop? As far as I know this means that docker is running in a virtual machine and is not configured through the /etc/docker/daemon.json config.
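
A quick way to check whether the CLI is talking to Docker Desktop's virtual machine rather than a host Docker Engine is to look at the active context and at the operating system the daemon reports. This is only a sketch under assumptions: Docker Desktop typically reports `Docker Desktop` as its operating system and, on Linux, usually uses a context named `desktop-linux`. The helper name `is_docker_desktop` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the reported OperatingSystem string looks like Docker Desktop.
is_docker_desktop() {
    case "$1" in
        *"Docker Desktop"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the daemon when the docker CLI is present.
if command -v docker >/dev/null 2>&1; then
    # The active context is marked with an asterisk in this listing.
    docker context ls 2>/dev/null
    os="$(docker info --format '{{.OperatingSystem}}' 2>/dev/null)"
    if is_docker_desktop "$os"; then
        echo "daemon looks like Docker Desktop; /etc/docker/daemon.json is likely not used"
    else
        echo "daemon reports: ${os:-unknown}"
    fi
fi
```

Note that `sudo docker` may use a different context than the unprivileged user, so it can be worth running the check both ways.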

@Overcraft90
Author

Ah, I see. I deemed it a bit safer than conventional Docker. What should I do in this situation?

@bsamadi

bsamadi commented Mar 21, 2023

Thank you @elezar for pointing out that Docker Desktop runs in a virtual machine. I uninstalled Docker Desktop and installed only Docker Engine. Everything now works as expected.
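
For anyone following along, the switch might look roughly like the sequence below. This is a sketch, not the exact commands used in this thread: the first two lines are assumptions based on Docker's usual apt-based instructions (and assume Docker's apt repository is already configured), while the last three are the commands from the original report. The function name `print_switch_steps` is hypothetical; it only prints the steps so they can be reviewed before running anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the steps instead of executing them.
print_switch_steps() {
    cat <<'EOF'
sudo apt remove docker-desktop
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
EOF
}

print_switch_steps
```

Re-running `nvidia-ctk runtime configure` after installing Docker Engine should be harmless if /etc/docker/daemon.json already contains the nvidia runtime entry.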

@Overcraft90
Author

Nice, thank you so much @elezar and @bsamadi. This solved the problem for me as well; I just have to switch to Docker Engine.
