This repository has been archived by the owner on Oct 27, 2023. It is now read-only.

Running nvidia-container-runtime with podman is blowing up. #85

Closed

rhatdan opened this issue Oct 18, 2019 · 90 comments

@rhatdan

rhatdan commented Oct 18, 2019

  1. Issue or feature description
    rootless and rootful podman do not work with the nvidia plugin

  2. Steps to reproduce the issue
    Install the nvidia plugin, configure it to run with podman,
    execute the podman command, and check if the devices are configured correctly.

  3. Information to attach (optional if deemed irrelevant)

    Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
    Kernel version from uname -a
    Fedora 30 and later
    Any relevant kernel output lines from dmesg
    Driver information from nvidia-smi -a
    Docker version from docker version
    NVIDIA packages version from dpkg -l 'nvidia' or rpm -qa 'nvidia'
    NVIDIA container library version from nvidia-container-cli -V
    NVIDIA container library logs (see troubleshooting)
    Docker command, image and tag used

I am reporting this based on complaints from other users. Here is what they said.

We discovered that the Ubuntu 18.04 machine needed a configuration change to get rootless working with nvidia:
"no-cgroups = true" was set in /etc/nvidia-container-runtime/config.toml
Unfortunately this config change did not work on CentOS 7, but it did change the rootless error to:
nvidia-container-cli: initialization error: cuda error: unknown error

This config change breaks podman running from root, with the error:
Failed to initialize NVML: Unknown Error

Interestingly, root on Ubuntu gets the same error even though rootless works.
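
For reference, the change in question is the no-cgroups key in the [nvidia-container-cli] section of /etc/nvidia-container-runtime/config.toml; a minimal sketch of the relevant lines (other keys omitted):

[nvidia-container-cli]
#no-cgroups = false
no-cgroups = true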

@rhatdan
Author

rhatdan commented Oct 18, 2019

The Podman team would like to work with you to get this working well in both rootful and rootless containers if possible. But we need someone to work with.

@rhatdan
Author

rhatdan commented Oct 18, 2019

@mheon @baude FYI

@zvonkok

zvonkok commented Oct 18, 2019

@sjug FYI

@RenaudWasTaken
Contributor

Hello!

@rhatdan do you mind filling out the following issue template: https://github.com/NVIDIA/nvidia-docker/blob/master/.github/ISSUE_TEMPLATE.md

Thanks!

@nvjmayo
Contributor

nvjmayo commented Oct 18, 2019

I can work with the podman team.

@rhatdan
Author

rhatdan commented Oct 18, 2019

@hholst80 FYI

@rhatdan
Author

rhatdan commented Oct 18, 2019

containers/podman#3659

@eaepstein

@nvjmayo Thanks for the suggestions. Some good news and some less good.

This works rootless:
podman run --rm --hooks-dir /usr/share/containers/oci/hooks.d nvcr.io/nvidia/cuda nvidia-smi
The same command continues to fail with the image: docker.io/nvidia/cuda

In fact, rootless works with or without /usr/share/containers/oci/hooks.d/01-nvhook.json installed when using the image nvcr.io/nvidia/cuda.

Running as root continues to fail when no-cgroups = true for either container, returning:
Failed to initialize NVML: Unknown Error

@rhatdan
Author

rhatdan commented Oct 20, 2019

Strange, I would not expect podman to run a hook that did not have a JSON file describing it.

@nvjmayo
Contributor

nvjmayo commented Oct 22, 2019

@eaepstein I'm still struggling to reproduce the issue you see. Using docker.io/nvidia/cuda also works for me with the hooks dir.

$ podman run --rm --hooks-dir /usr/share/containers/oci/hooks.d/ docker.io/nvidia/cuda nvidia-smi
Tue Oct 22 21:35:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 00000000:65:00.0 N/A |                  N/A |
| 50%   38C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

Without the hook I would expect to see a failure roughly like:

Error: time="2019-10-22T14:35:14-07:00" level=error msg="container_linux.go:346: starting container process caused \"exec: \\\"nvidia-smi\\\": executable file not found in $PATH\""
container_linux.go:346: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": OCI runtime command not found error

This is because the libraries and tools get installed by the hook in order to match the host drivers (an unfortunate limitation of tightly coupled driver+library releases).

I think this is a configuration issue and not an issue with the container image (docker.io/nvidia/cuda vs nvcr.io/nvidia/cuda).

Reviewing my earlier posts, I recommend changing my 01-nvhook.json and removing the NVIDIA_REQUIRE_CUDA=cuda>=10.1 entry from it. My assumption was that everyone has the latest CUDA install, which was kind of a silly assumption on my part. The CUDA version doesn't have to be specified, and you can leave this environment variable out of your setup. It was an artifact of my earlier experiments.
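
A minimal sketch of what a 01-nvhook.json without that environment variable might look like (the hook path is an assumption and can differ between installs; NVIDIA's packaged hook file is the authoritative version):

{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["/usr/bin/nvidia-container-runtime-hook", "prestart"],
    "env": []
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}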

@eaepstein

@nvjmayo we started from scratch with a new machine (CentOS Linux release 7.7.1908) and both docker.io and nvcr.io images are working for us now too. And --hooks-dir must now be specified for both to work. Thanks for the help!

@eaepstein

@rhatdan @nvjmayo Turns out that getting rootless podman working with nvidia on CentOS 7 is a bit more complicated, at least for us.

Here is our scenario on a brand new CentOS 7.7 machine:

  1. Run nvidia-smi with rootless podman.
     Result: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused "process_linux.go:413: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: cuda error: unknown error\\n\""

  2. Run podman with user=root.
     Result: nvidia-smi works.

  3. Run podman rootless.
     Result: nvidia-smi works!

  4. Reboot the machine, run podman rootless.
     Result: fails again with the same error as in step 1.

Conclusion: running an nvidia container with podman as root changes the environment so that rootless works. The environment is cleared on reboot.

One other comment: podman as root and rootless podman cannot run with the same /etc/nvidia-container-runtime/config.toml: no-cgroups must be false for root and true for rootless.

@rhatdan
Author

rhatdan commented Oct 25, 2019

If the nvidia hook is doing any privileged operations like modifying /dev and adding device nodes, then this will not work with rootless. (In rootless mode all processes run with the user's UID.) Probably when you run rootful, it does the privileged operations, so the next time you run rootless, those activities do not need to be done.

I would suggest that, for rootless systems, the /dev and nvidia setup be done via a systemd unit file, so the system is preconfigured and the rootless jobs will work fine.
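
A minimal sketch of such a unit, assuming a hypothetical /usr/local/sbin/nvidia-device-nodes.sh script that loads the modules and creates the /dev/nvidia* nodes:

# /etc/systemd/system/nvidia-device-nodes.service (hypothetical)
[Unit]
Description=Create NVIDIA device nodes for rootless containers
After=systemd-modules-load.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Hypothetical script that modprobes nvidia/nvidia_uvm and mknods /dev/nvidia*
ExecStart=/usr/local/sbin/nvidia-device-nodes.sh

[Install]
WantedBy=multi-user.target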

@eaepstein

After running nvidia/cuda with rootful podman, the following device nodes exist:
crw-rw-rw-. 1 root root 195, 254 Oct 25 09:11 nvidia-modeset
crw-rw-rw-. 1 root root 195, 255 Oct 25 09:11 nvidiactl
crw-rw-rw-. 1 root root 195, 0 Oct 25 09:11 nvidia0
crw-rw-rw-. 1 root root 241, 1 Oct 25 09:11 nvidia-uvm-tools
crw-rw-rw-. 1 root root 241, 0 Oct 25 09:11 nvidia-uvm

None of these devices exist after boot. Running nvidia-smi rootless (no podman) creates:
crw-rw-rw-. 1 root root 195, 0 Oct 25 13:40 nvidia0
crw-rw-rw-. 1 root root 195, 255 Oct 25 13:40 nvidiactl

I created the other three entries using "sudo mknod -m 666 etc..." but that is not enough to run rootless. Something else is needed in the environment.
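
For reference, the mknod invocations implied above look roughly like this (major/minor numbers taken from the listing earlier in this comment; the 241 major for nvidia-uvm is assigned dynamically and can differ between systems, see /proc/devices):

sudo mknod -m 666 /dev/nvidia-modeset c 195 254
sudo mknod -m 666 /dev/nvidia-uvm c 241 0
sudo mknod -m 666 /dev/nvidia-uvm-tools c 241 1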

Running nvidia/cuda with rootful podman at boot would work, but it's not pretty.

Thanks for the suggestion

@flx42
Member

flx42 commented Oct 25, 2019

This behavior is documented in our installation guide:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-verifications

From a user namespace you can't mknod or use nvidia-modprobe. But if this binary is present and can be called in a context where setuid works, it's an option.

There is already nvidia-persistenced as a systemd unit file, but it won't load the nvidia_uvm kernel module or create the device files, IIRC.

Another option is to use udev rules, which is what Ubuntu is doing:

$ cat /lib/udev/rules.d/71-nvidia.rules 
[...]

# Load and unload nvidia-uvm module
ACTION=="add", DEVPATH=="/bus/pci/drivers/nvidia", RUN+="/sbin/modprobe nvidia-uvm"
ACTION=="remove", DEVPATH=="/bus/pci/drivers/nvidia", RUN+="/sbin/modprobe -r nvidia-uvm"

# This will create the nvidia device nodes
ACTION=="add", DEVPATH=="/bus/pci/drivers/nvidia", RUN+="/usr/bin/nvidia-smi"

# Create the device node for the nvidia-uvm module
ACTION=="add", DEVPATH=="/module/nvidia_uvm", SUBSYSTEM=="module", RUN+="/sbin/create-uvm-dev-node"
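
The /sbin/create-uvm-dev-node helper itself is not shown above; as an assumption, a script like it typically looks up the dynamically assigned major number and creates the node, roughly:

#!/bin/sh
# Look up the dynamic major number assigned to nvidia-uvm and create its device node
major=$(grep nvidia-uvm /proc/devices | awk '{print $1}')
[ -n "$major" ] && mknod -m 666 /dev/nvidia-uvm c "$major" 0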

@rhatdan
Author

rhatdan commented Oct 26, 2019

Udev rules make sense to me.

@eaepstein

@flx42
sudo'ing the setup script in "4.5. Device Node Verification" is the only thing needed to get rootless nvidia/cuda containers running for us. It created the following devices:
crw-rw-rw-.  1 root root    195,   0 Oct 27 20:38 nvidia0
crw-rw-rw-.  1 root root    195, 255 Oct 27 20:38 nvidiactl
crw-rw-rw-.  1 root root    241,   0 Oct 27 20:38 nvidia-uvm

The udev file only created the first two and was not sufficient by itself.
We'll go with a unit file for the setup script.

Many thanks for your help.

@qhaas

qhaas commented Jan 8, 2020

Thanks guys, with insight from this issue and others, I was able to get podman working with my Quadro in EL7 using sudo podman run --privileged --rm --hooks-dir /usr/share/containers/oci/hooks.d docker.io/nvidia/cudagl:10.1-runtime-centos7 nvidia-smi after installing the 'nvidia-container-toolkit' package.

Once the dust settles on how to get GPU support in rootless podman in EL7, a step-by-step guide would make for a great blog post and/or entry into the podman and/or nvidia documentation.

@dagrayvid

Hello @nvjmayo and @rhatdan. I'm wondering if there is an update on this issue or this one for how to access NVIDIA GPUs from containers run rootless with podman.

On RHEL8.1, with the default /etc/nvidia-container-runtime/config.toml and running containers as root, GPU access works as expected. Rootless does not work by default; it fails with cgroup-related errors (as expected).

After modifying the config.toml file -- setting no-cgroups = true and changing the debug log file -- rootless works. However, these changes make GPU access fail in containers run as root, with error "Failed to initialize NVML: Unknown Error."

Please let me know if there is any recent documentation on how to do this beyond these two issues.

@jamescassell

Steps to get it working on RHEL 8.1:

  1. Install the Nvidia drivers, make sure nvidia-smi works on the host.
  2. Install nvidia-container-toolkit from the repos at
     baseurl=https://nvidia.github.io/libnvidia-container/centos7/$basearch
     baseurl=https://nvidia.github.io/nvidia-container-runtime/centos7/$basearch
  3. Modify /etc/nvidia-container-runtime/config.toml and change these values:
     [nvidia-container-cli]
     #no-cgroups = false
     no-cgroups = true
     [nvidia-container-runtime]
     #debug = "/var/log/nvidia-container-runtime.log"
     debug = "~/./local/nvidia-container-runtime.log"
  4. Run it rootless as podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ nvidia/cuda:10.2-devel-ubi8 /usr/bin/nvidia-smi

@zvonkok

zvonkok commented Mar 27, 2020

/cc @dagrayvid

@dagrayvid

dagrayvid commented Mar 30, 2020

Thanks @jamescassell.

I repeated those steps on RHEL8.1, and nvidia-smi works as expected when running rootless. However, once those changes are made, I am unable to run nvidia-smi in a container run as root. Is this behaviour expected, or is there some change in CLI flags needed when running as root? Running as root did work before making these changes.

Is there a way to configure a system so that we can use GPUs with podman as both root and a non-root user?

@andrewssobral

I can't run rootless podman with the GPU, can someone help me?

docker run --runtime=nvidia --privileged nvidia/cuda nvidia-smi works fine but
podman run --runtime=nvidia --privileged nvidia/cuda nvidia-smi crashes, same for
sudo podman run --runtime=nvidia --privileged nvidia/cuda nvidia-smi

Output:

$ podman run --runtime=nvidia --privileged nvidia/cuda nvidia-smi
2020/04/03 13:34:52 ERROR: /usr/bin/nvidia-container-runtime: find runc path: exec: "runc": executable file not found in $PATH
Error: `/usr/bin/nvidia-container-runtime start e3ccb660bf27ce0858ee56476e58b53cd3dc900e8de80f08d10f3f844c0e9f9a` failed: exit status 1

But, runc exists:

$ whereis runc
runc: /usr/bin/runc
$ whereis docker-runc
docker-runc:
$ podman --version
podman version 1.8.2
$ cat ~/.config/containers/libpod.conf
# libpod.conf is the default configuration file for all tools using libpod to
# manage containers

# Default transport method for pulling and pushing for images
image_default_transport = "docker://"

# Paths to look for the conmon container manager binary.
# If the paths are empty or no valid path was found, then the `$PATH`
# environment variable will be used as the fallback.
conmon_path = [
            "/usr/libexec/podman/conmon",
            "/usr/local/libexec/podman/conmon",
            "/usr/local/lib/podman/conmon",
            "/usr/bin/conmon",
            "/usr/sbin/conmon",
            "/usr/local/bin/conmon",
            "/usr/local/sbin/conmon",
            "/run/current-system/sw/bin/conmon",
]

# Environment variables to pass into conmon
conmon_env_vars = [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
]

# CGroup Manager - valid values are "systemd" and "cgroupfs"
#cgroup_manager = "systemd"

# Container init binary
#init_path = "/usr/libexec/podman/catatonit"

# Directory for persistent libpod files (database, etc)
# By default, this will be configured relative to where containers/storage
# stores containers
# Uncomment to change location from this default
#static_dir = "/var/lib/containers/storage/libpod"

# Directory for temporary files. Must be tmpfs (wiped after reboot)
#tmp_dir = "/var/run/libpod"
tmp_dir = "/run/user/1000/libpod/tmp"

# Maximum size of log files (in bytes)
# -1 is unlimited
max_log_size = -1

# Whether to use chroot instead of pivot_root in the runtime
no_pivot_root = false

# Directory containing CNI plugin configuration files
cni_config_dir = "/etc/cni/net.d/"

# Directories where the CNI plugin binaries may be located
cni_plugin_dir = [
               "/usr/libexec/cni",
               "/usr/lib/cni",
               "/usr/local/lib/cni",
               "/opt/cni/bin"
]

# Default CNI network for libpod.
# If multiple CNI network configs are present, libpod will use the network with
# the name given here for containers unless explicitly overridden.
# The default here is set to the name we set in the
# 87-podman-bridge.conflist included in the repository.
# Not setting this, or setting it to the empty string, will use normal CNI
# precedence rules for selecting between multiple networks.
cni_default_network = "podman"

# Default libpod namespace
# If libpod is joined to a namespace, it will see only containers and pods
# that were created in the same namespace, and will create new containers and
# pods in that namespace.
# The default namespace is "", which corresponds to no namespace. When no
# namespace is set, all containers and pods are visible.
#namespace = ""

# Default infra (pause) image name for pod infra containers
infra_image = "k8s.gcr.io/pause:3.1"

# Default command to run the infra container
infra_command = "/pause"

# Determines whether libpod will reserve ports on the host when they are
# forwarded to containers. When enabled, when ports are forwarded to containers,
# they are held open by conmon as long as the container is running, ensuring that
# they cannot be reused by other programs on the host. However, this can cause
# significant memory usage if a container has many ports forwarded to it.
# Disabling this can save memory.
#enable_port_reservation = true

# Default libpod support for container labeling
# label=true

# The locking mechanism to use
lock_type = "shm"

# Number of locks available for containers and pods.
# If this is changed, a lock renumber must be performed (e.g. with the
# 'podman system renumber' command).
num_locks = 2048

# Directory for libpod named volumes.
# By default, this will be configured relative to where containers/storage
# stores containers.
# Uncomment to change location from this default.
#volume_path = "/var/lib/containers/storage/volumes"

# Selects which logging mechanism to use for Podman events.  Valid values
# are `journald` or `file`.
# events_logger = "journald"

# Specify the keys sequence used to detach a container.
# Format is a single character [a-Z] or a comma separated sequence of
# `ctrl-<value>`, where `<value>` is one of:
# `a-z`, `@`, `^`, `[`, `\`, `]`, `^` or `_`
#
# detach_keys = "ctrl-p,ctrl-q"

# Default OCI runtime
runtime = "runc"

# List of the OCI runtimes that support --format=json.  When json is supported
# libpod will use it for reporting nicer errors.
runtime_supports_json = ["crun", "runc"]

# List of all the OCI runtimes that support --cgroup-manager=disable to disable
# creation of CGroups for containers.
runtime_supports_nocgroups = ["crun"]

# Paths to look for a valid OCI runtime (runc, runv, etc)
# If the paths are empty or no valid path was found, then the `$PATH`
# environment variable will be used as the fallback.
[runtimes]
runc = [
            "/usr/bin/runc",
            "/usr/sbin/runc",
            "/usr/local/bin/runc",
            "/usr/local/sbin/runc",
            "/sbin/runc",
            "/bin/runc",
            "/usr/lib/cri-o-runc/sbin/runc",
            "/run/current-system/sw/bin/runc",
]

crun = [
                "/usr/bin/crun",
                "/usr/sbin/crun",
                "/usr/local/bin/crun",
                "/usr/local/sbin/crun",
                "/sbin/crun",
                "/bin/crun",
                "/run/current-system/sw/bin/crun",
]

nvidia = ["/usr/bin/nvidia-container-runtime"]

# Kata Containers is an OCI runtime, where containers are run inside lightweight
# Virtual Machines (VMs). Kata provides additional isolation towards the host,
# minimizing the host attack surface and mitigating the consequences of
# containers breakout.
# Please notes that Kata does not support rootless podman yet, but we can leave
# the paths below blank to let them be discovered by the $PATH environment
# variable.

# Kata Containers with the default configured VMM
kata-runtime = [
    "/usr/bin/kata-runtime",
]

# Kata Containers with the QEMU VMM
kata-qemu = [
    "/usr/bin/kata-qemu",
]

# Kata Containers with the Firecracker VMM
kata-fc = [
    "/usr/bin/kata-fc",
]

# The [runtimes] table MUST be the last thing in this file.
# (Unless another table is added)
# TOML does not provide a way to end a table other than a further table being
# defined, so every key hereafter will be part of [runtimes] and not the main
# config.
$ cat /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
debug = "/tmp/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig.real"

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
debug = "/tmp/nvidia-container-runtime.log"
$ cat /tmp/nvidia-container-runtime.log
2020/04/03 13:23:02 Running /usr/bin/nvidia-container-runtime
2020/04/03 13:23:02 Using bundle file: /home/andrews/.local/share/containers/storage/vfs-containers/614cb26f8f4719e3aba56be2e1a6dc29cd91ae760d9fe3bf83d6d1b24becc638/userdata/config.json
2020/04/03 13:23:02 prestart hook path: /usr/bin/nvidia-container-runtime-hook
2020/04/03 13:23:02 Prestart hook added, executing runc
2020/04/03 13:23:02 Looking for "docker-runc" binary
2020/04/03 13:23:02 "docker-runc" binary not found
2020/04/03 13:23:02 Looking for "runc" binary
2020/04/03 13:23:02 Runc path: /usr/bin/runc
2020/04/03 13:23:09 Running /usr/bin/nvidia-container-runtime
2020/04/03 13:23:09 Command is not "create", executing runc doing nothing
2020/04/03 13:23:09 Looking for "docker-runc" binary
2020/04/03 13:23:09 "docker-runc" binary not found
2020/04/03 13:23:09 Looking for "runc" binary
2020/04/03 13:23:09 ERROR: find runc path: exec: "runc": executable file not found in $PATH
2020/04/03 13:31:06 Running nvidia-container-runtime
2020/04/03 13:31:06 Command is not "create", executing runc doing nothing
2020/04/03 13:31:06 Looking for "docker-runc" binary
2020/04/03 13:31:06 "docker-runc" binary not found
2020/04/03 13:31:06 Looking for "runc" binary
2020/04/03 13:31:06 Runc path: /usr/bin/runc
$ nvidia-container-runtime --version
runc version 1.0.0-rc8
commit: 425e105d5a03fabd737a126ad93d62a9eeede87f
spec: 1.0.1-dev
NVRM version:   440.64.00
CUDA version:   10.2

Device Index:   0
Device Minor:   0
Model:          GeForce RTX 2070
Brand:          GeForce
GPU UUID:       GPU-22dfd02e-a668-a6a6-a90a-39d6efe475ee
Bus Location:   00000000:01:00.0
Architecture:   7.5
$ docker version
Client:
 Version:           18.09.7
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        2d0083d
 Built:             Thu Jun 27 17:56:23 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b7f0
  Built:            Wed Mar 11 01:24:19 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

@jamescassell

See particularly step 4. #85 (comment)

@rhatdan
Author

rhatdan commented Apr 3, 2020

This looks like the nvidia plugin is searching for a hard-coded path to runc?

@andrewssobral

andrewssobral commented Apr 3, 2020

[updated] Hi @jamescassell, unfortunately it does not work for me (same error when using sudo).

$ podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ --runtime=nvidia nvidia/cuda nvidia-smi
2020/04/03 17:33:06 ERROR: /usr/bin/nvidia-container-runtime: find runc path: exec: "runc": executable file not found in $PATH
2020/04/03 17:33:06 ERROR: /usr/bin/nvidia-container-runtime: find runc path: exec: "runc": executable file not found in $PATH
Error: `/usr/bin/nvidia-container-runtime start 060398d97299ee033e8ebd698a11c128bd80ce641dd389976ca43a34b26abab3` failed: exit status 1

@Ru13en

Ru13en commented Jun 17, 2021

@elezar Thanks, I've opened another issue:
#142

@fuomag9

fuomag9 commented Jul 18, 2021

For anybody who has the same issue as me ("nvidia-smi": executable file not found in $PATH: OCI not found, or no NVIDIA GPU device is present: /dev/nvidia0 does not exist), this is how I made it work on Kubuntu 21.04 rootless:

Add your user to the video group if not already a member:
usermod -a -G video $USER

/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json:

{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["/usr/bin/nvidia-container-runtime-hook", "prestart"],
    "env": []
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}

/etc/nvidia-container-runtime/config.toml:

disable-require = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-runtime-hook.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig.real"

podman run -it --group-add video docker.io/tensorflow/tensorflow:latest-gpu-jupyter nvidia-smi

Sun Jul 18 11:45:06 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.31       Driver Version: 465.31       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:09:00.0  On |                  N/A |
| 31%   43C    P8     6W / 215W |   2582MiB /  7979MiB |      9%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

@helson73

helson73 commented Oct 1, 2021

@rhatdan @nvjmayo Turns out that getting rootless podman working with nvidia on CentOS 7 is a bit more complicated, at least for us.

Here is our scenario on a brand new CentOS 7.7 machine:

  1. Run nvidia-smi with rootless podman.
     Result: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused "process_linux.go:413: running prestart hook 0 caused "error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: cuda error: unknown error\n""
  2. Run podman with user=root.
     Result: nvidia-smi works.
  3. Run podman rootless.
     Result: nvidia-smi works!
  4. Reboot the machine, run podman rootless.
     Result: fails again with the same error as in step 1.

Conclusion: running an nvidia container with podman as root changes the environment so that rootless works. The environment is cleared on reboot.

One other comment: podman as root and rootless podman cannot run with the same /etc/nvidia-container-runtime/config.toml: no-cgroups must be false for root and true for rootless.

Hi, have you figured out the solution?
I have exactly the same symptom as yours.

Rootless only works after launching a container as root at least once, and a reboot resets everything.
I am using RHEL 8.4 and can't believe this still happens after a year...

@qhaas

qhaas commented Nov 17, 2021

For those dropping into this issue, nvidia has documented getting GPU acceleration working with podman.

@fuomag9

fuomag9 commented Nov 17, 2021

For those dropping into this issue, nvidia has documented getting GPU acceleration working with podman.

That's awesome! The documentation is almost the same as my fix here in this thread :D

@rhatdan
Author

rhatdan commented Nov 17, 2021

Any chance they can update the version of podman in the example? That one is pretty old.

@KCSesh

KCSesh commented Nov 18, 2021

@fuomag9 Are you using crun as opposed to runc out of curiosity?
Does it work with both in rootless for you? Or just crun?

@fuomag9

fuomag9 commented Nov 18, 2021

@fuomag9 Are you using crun as opposed to runc out of curiosity? Does it work with both in rootless for you? Or just crun?

Working for me with both runc and crun set via /etc/containers/containers.conf with runtime = "XXX"

@qhaas

qhaas commented Dec 6, 2021

--hooks-dir /usr/share/containers/oci/hooks.d/ does not seem to be needed anymore, at least with podman 3.3.1 and nvidia-container-toolkit 1.7.0.
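
So a minimal invocation along these lines should work, assuming the toolkit has installed its OCI hook into podman's default hooks directory (image as used earlier in this thread):

podman run --rm --security-opt=label=disable docker.io/nvidia/cuda nvidia-smi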

For RHEL8 systems where SELinux is enforcing, is it 'best practice' to add the nvidia SELinux policy module and run podman with --security-opt label=type:nvidia_container_t (per RH documentation, even on non-DGX systems), or just run podman with --security-opt=label=disable (per nvidia documentation)? It is unclear whether there is any significant benefit to warrant messing with SELinux policy.

@decandia50

decandia50 commented Sep 21, 2022

For folks finding this issue, especially anyone trying to do this on RHEL8 after following https://www.redhat.com/en/blog/how-use-gpus-containers-bare-metal-rhel-8, here's the current status/known issues that I've encountered. Hopefully this saves someone some time.

As noted in the comments above you can run containers as root without issue, but if you try to use --userns keep-id you're going to have a bad day.

Things that need to be done ahead of time to run rootless containers are documented in https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#step-3-rootless-containers-setup but the cheat sheet version is:

  1. Install nvidia-container-toolkit
  2. Update /etc/nvidia-container-runtime/config.toml and set no-cgroups = true
  3. Use NVIDIA_VISIBLE_DEVICES as part of your podman environment.
  4. Specify --hooks-dir=/usr/share/containers/oci/hooks.d/ (may not strictly be needed).

If you do that, then running: podman run -e NVIDIA_VISIBLE_DEVICES=all --hooks-dir=/usr/share/containers/oci/hooks.d/ --rm -ti myimage nvidia-smi should result in the usual nvidia-smi output. But, you'll note that the user in the container is root and that may not be what you want. If you use --userns keep-id; e.g. podman run --userns keep-id -e NVIDIA_VISIBLE_DEVICES=all --hooks-dir=/usr/share/containers/oci/hooks.d/ --rm -ti myimage nvidia-smi you will get an error that states: Error: OCI runtime error: crun: error executing hook /usr/bin/nvidia-container-toolkit (exit code: 1). From my reading above the checks that are run require the user to be root in the container.

Now for the workaround. You don't need this hook, you just need the nvidia-container-cli tool. All the hook really does is mount the correct libraries, devices, and binaries from the underlying system into the container. We can use nvidia-container-cli -k list and find to accomplish the same thing. Here's my one-liner below. Note that I'm excluding both -e NVIDIA_VISIBLE_DEVICES=all and --hooks-dir=/usr/share/containers/oci/hooks.d/.

Here's what it looks like:
podman run --userns keep-id $(for file in $(nvidia-container-cli -k list); do find -L $(dirname $file) -xdev -samefile $file; done | awk '{print " -v "$1":"$1}' | xargs) --rm -ti myimage nvidia-smi

This is what the above is doing. We run nvidia-container-cli -k list which on my system produces output like:

$ nvidia-container-cli -k list
/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvidia-modeset
/dev/nvidia0
/dev/nvidia1
/usr/bin/nvidia-smi
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-cfg.so.470.141.03
/usr/lib64/libcuda.so.470.141.03
/usr/lib64/libnvidia-opencl.so.470.141.03
/usr/lib64/libnvidia-ptxjitcompiler.so.470.141.03
/usr/lib64/libnvidia-allocator.so.470.141.03
/usr/lib64/libnvidia-compiler.so.470.141.03
/usr/lib64/libnvidia-ngx.so.470.141.03
/usr/lib64/libnvidia-encode.so.470.141.03
/usr/lib64/libnvidia-opticalflow.so.470.141.03
/usr/lib64/libnvcuvid.so.470.141.03
/usr/lib64/libnvidia-eglcore.so.470.141.03
/usr/lib64/libnvidia-glcore.so.470.141.03
/usr/lib64/libnvidia-tls.so.470.141.03
/usr/lib64/libnvidia-glsi.so.470.141.03
/usr/lib64/libnvidia-fbc.so.470.141.03
/usr/lib64/libnvidia-ifr.so.470.141.03
/usr/lib64/libnvidia-rtcore.so.470.141.03
/usr/lib64/libnvoptix.so.470.141.03
/usr/lib64/libGLX_nvidia.so.470.141.03
/usr/lib64/libEGL_nvidia.so.470.141.03
/usr/lib64/libGLESv2_nvidia.so.470.141.03
/usr/lib64/libGLESv1_CM_nvidia.so.470.141.03
/usr/lib64/libnvidia-glvkspirv.so.470.141.03
/usr/lib64/libnvidia-cbl.so.470.141.03
/lib/firmware/nvidia/470.141.03/gsp.bin

We then loop through each of those files and run find -L $(dirname $file) -xdev -samefile $file. That finds all the symlinks to a given file, e.g.:

find -L /usr/lib64 -xdev -samefile /usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-ml.so.1
/usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-ml.so

We loop through each of those files and use awk and xargs to create the podman CLI arguments to bind-mount these files into the container, e.g. -v /usr/lib64/libnvidia-ml.so.1:/usr/lib64/libnvidia-ml.so.1 -v /usr/lib64/libnvidia-ml.so.470.141.03:/usr/lib64/libnvidia-ml.so.470.141.03 -v /usr/lib64/libnvidia-ml.so:/usr/lib64/libnvidia-ml.so etc.

This effectively does what the hook does, using the tools the hook provides, but does not require the user running the container to be root, and does not require the user inside of the container to be root.
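
For readability, here is the same workaround as a short script rather than a one-liner (same assumptions as above; myimage is a placeholder):

#!/bin/bash
# Build -v host:container bind-mount arguments for every file (and symlink)
# that nvidia-container-cli reports, then run the container rootless.
vol_args=()
for file in $(nvidia-container-cli -k list); do
  for f in $(find -L "$(dirname "$file")" -xdev -samefile "$file"); do
    vol_args+=(-v "$f:$f")
  done
done
podman run --userns keep-id "${vol_args[@]}" --rm -ti myimage nvidia-smi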

Hopefully this saves someone else a few hours.

@baude

baude commented Sep 21, 2022

@decandia50 Excellent information! Your information really deserves to be highlighted. Would you consider posting it as a blog if we connect you with some people?

@klueska
Contributor

klueska commented Sep 21, 2022

Please do not write a blog post with the above information. While the procedure may work on some setups, it is not a supported use of the nvidia-container-cli tool and will only work correctly under a very narrow set of assumptions.

The better solution is to use podman's integrated CDI support to have podman do the work that libnvidia-container would otherwise have done. The future of the nvidia stack (and device support in container runtimes in general) is CDI, and starting to use this method now will future-proof how you access devices going forward.

Please see below for details on CDI:
https://github.com/container-orchestrated-devices/container-device-interface

We have spent the last year rearchitecting the NVIDIA container stack to work together with CDI, and as part of this we have a tool coming out with the next release that will be able to generate CDI specs for nvidia devices for use with podman (and any other CDI-compatible runtimes).

In the meantime, you can generate a CDI spec manually, or wait for @elezar to comment on a better method to get a CDI spec generated today.

@klueska
Contributor

klueska commented Sep 21, 2022

Here is an example of a (fully functional) CDI spec on my DGX-A100 machine (excluding MIG devices):

cdiVersion: 0.4.0
kind: nvidia.com/gpu
containerEdits:
  hooks:
  - hookName: createContainer
    path: /usr/bin/nvidia-ctk
    args:
    - /usr/bin/nvidia-ctk
    - hook
    - update-ldcache
    - --folder
    - /usr/lib/x86_64-linux-gnu
  deviceNodes:
  - path: /dev/nvidia-modeset
  - path: /dev/nvidiactl
  - path: /dev/nvidia-uvm
  - path: /dev/nvidia-uvm-tools
  mounts:
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libcuda.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libcuda.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvcuvid.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvcuvid.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvoptix.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvoptix.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGL.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGL.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libEGL.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libEGL.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv2.so.2.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv2.so.2.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-smi
    hostPath: /usr/bin/nvidia-smi
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-debugdump
    hostPath: /usr/bin/nvidia-debugdump
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-persistenced
    hostPath: /usr/bin/nvidia-persistenced
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-control
    hostPath: /usr/bin/nvidia-cuda-mps-control
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-server
    hostPath: /usr/bin/nvidia-cuda-mps-server
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /var/run/nvidia-persistenced/socket
    hostPath: /var/run/nvidia-persistenced/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /var/run/nvidia-fabricmanager/socket
    hostPath: /var/run/nvidia-fabricmanager/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
devices:
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia0
  name: gpu0
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
  name: gpu1
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia2
  name: gpu2
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia3
  name: gpu3
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia4
  name: gpu4
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia5
  name: gpu5
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia6
  name: gpu6
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia7
  name: gpu7
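
Assuming the spec above is saved where CDI-aware runtimes look for specs (for example /etc/cdi/nvidia.yaml) and a podman version with CDI support, a device can then be requested by its fully qualified name, e.g.:

podman run --rm --device nvidia.com/gpu=gpu0 docker.io/nvidia/cuda nvidia-smi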

@decandia50

@elezar Can you comment on the availability of a tool to generate the CDI spec as proposed by @klueska? I'm happy to use CDI if that's the way forward. Also happy to beta test a tool if you point me towards something.

@Ru13en

Ru13en commented Sep 22, 2022

Maintaining a nvidia.json CDI spec file for multiple machines with different NVIDIA drivers and other libs is a bit painful.
For instance, the NVIDIA driver installer should create a libnvidia-compiler.so symlink to libnvidia-compiler.so.460.91.03, etc.
The CDI nvidia.json could then just reference the symlinks, avoiding the manual setting of all mappings for a particular driver version...
I am already using CDI specs on our machines, but I would like to test a tool to generate the CDI spec for any system...

@elezar
Member

elezar commented Sep 23, 2022

@Ru13en we have a WIP Merge Request that adds an:

nvidia-ctk info generate-cdi

command to the NVIDIA Container Toolkit. The idea being that this could be run at boot or triggered on a driver installation / upgrade. We are working on getting a v1.12.0-rc.1 out that includes this functionality for early testing and feedback.

@starry91

@elezar Any ETA on when we can expect the v1.12.0-rc.1 version?

@elezar
Member

elezar commented Sep 30, 2022

It will be released next week.

@starry91

starry91 commented Oct 7, 2022

I tried using the WIP version of nvidia-ctk (from the master branch of https://gitlab.com/nvidia/container-toolkit/container-toolkit) and was able to get it working with rootless podman, but not without issues. I have documented them in https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/8.
@rhatdan The upcoming version of the nvidia CDI generator will use CDI version 0.5.0, while the latest podman version, 4.2.0, still uses 0.4.0. Any idea when 4.3.0 might be available? (I see that 4.3.0-rc1 uses 0.5.0.)

@elezar
Member

elezar commented Oct 7, 2022

Thanks for the confirmation @starry91. The official release of v1.12.0-rc.1 has been delayed a little, but thanks for testing the tooling nonetheless. I will have a look at the issue you created and update the tooling before releasing the rc.

@elezar
Member

elezar commented Oct 20, 2023

We have recently updated our Podman support and now recommend using CDI -- which is supported natively in more recent Podman versions.

See https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-podman for details.
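
As a rough sketch of the documented flow (see the linked guide for the authoritative commands; output path and image here are examples): generate a CDI spec with nvidia-ctk, then request devices by name:

sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable docker.io/nvidia/cuda nvidia-smi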

@elezar elezar closed this as completed Oct 20, 2023