Issue Description
"Device requests" are how GPUs are invoked from the Docker API. However, device requests are not being respected by Podman when creating a container over the Podman socket.
Steps to reproduce the issue
Here is a Python script which tests the Docker and Podman socket APIs.
Setup: install Python version 3.10+ and run pip install docker==7.0.0
Run these tests:
import subprocess as sp
import docker
import docker.types
# Setup: create unix socket clients
# --------------------------------------------------------------------------------
podman_socket = sp.check_output(['podman', 'info', '--format', '{{ .Host.RemoteSocket.Path }}'], text=True).strip()
podman_client = docker.DockerClient(base_url=f'unix://{podman_socket}')
docker_client = docker.DockerClient(base_url='unix:///var/run/docker.sock')
# Sanity checks: assert podman is working
# --------------------------------------------------------------------------------
assert b'!... Hello Podman World ...!' in podman_client.containers.run('quay.io/podman/hello', auto_remove=True)
# Sanity checks: assert podman and docker both work with nvidia-container-toolkit
# --------------------------------------------------------------------------------
def test_nvidia_smi_works_using_command(command: str):
    assert sp.check_output([command, 'run', '--rm', '--gpus=all', 'registry.access.redhat.com/ubi9:9.4-947.1714667021', 'nvidia-smi', '-L']).startswith(b'GPU 0')
test_nvidia_smi_works_using_command('docker')
test_nvidia_smi_works_using_command('podman')
# Bug reproduction cases
# --------------------------------------------------------------------------------
GPU_REQUEST = {
'device_requests': [ docker.types.DeviceRequest(count=1, capabilities=[['gpu']]) ]
}
def test_nvidia_smi_works_using_client(client: docker.DockerClient):
    assert client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], **GPU_REQUEST).startswith(b'GPU 0')
test_nvidia_smi_works_using_client(docker_client) # pass
test_nvidia_smi_works_using_client(podman_client) # fail
def test_device_request_goes_through(client: docker.DockerClient):
    container = client.containers.run('registry.access.redhat.com/ubi9:9.4-947.1714667021', ['nvidia-smi', '-L'], detach=True, **GPU_REQUEST)
    assert len(container.attrs['HostConfig']['DeviceRequests']) > 0
    assert any(request.get('Capabilities', None) == ['gpu'] for request in container.attrs['HostConfig']['DeviceRequests'])
test_device_request_goes_through(docker_client) # pass
test_device_request_goes_through(podman_client) # fail
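For reference, here is a rough sketch of the container-create request body that GPU_REQUEST in the script above is expected to turn into. The field names follow my understanding of the Docker Engine API DeviceRequest object (Driver, Count, DeviceIDs, Capabilities, Options). The helper name gpu_create_body is hypothetical, and the exact JSON that docker-py serializes may differ in detail (for example in how empty fields are represented). It uses only the standard library and does not talk to any socket.

```python
import json


def gpu_create_body(image, command, count=1):
    """Build a container-create body that asks for `count` GPUs.

    Hypothetical helper for illustration only; the layout is an assumption
    based on the Docker Engine API's HostConfig.DeviceRequests field.
    """
    return {
        "Image": image,
        "Cmd": list(command),
        "HostConfig": {
            "DeviceRequests": [
                {
                    "Driver": "",                # empty: let the daemon choose the driver
                    "Count": count,              # number of devices requested
                    "DeviceIDs": None,           # no specific device IDs
                    "Capabilities": [["gpu"]],   # a list of capability sets
                    "Options": {},
                }
            ]
        },
    }


body = gpu_create_body(
    "registry.access.redhat.com/ubi9:9.4-947.1714667021",
    ["nvidia-smi", "-L"],
)
print(json.dumps(body, indent=2))
```

Comparing a body like this against what the Podman socket actually receives and stores may help narrow down where the device request is dropped.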
Describe the results you received
- The device request is not honored when creating a container via the Podman socket
Describe the results you expected
- It should be possible to create containers with GPUs over the Podman socket API
- The created container should have a non-empty value for .HostConfig.DeviceRequests
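The expectation about .HostConfig.DeviceRequests can be checked on the inspect data alone, without a GPU. Below is a minimal sketch of such a check, written as a pure function over the attrs dictionary that docker-py exposes as container.attrs. The function name has_gpu_request is hypothetical. It assumes Capabilities is normally a list of capability sets (for example [["gpu"]]) but also tolerates a flat list of strings, and it treats a missing or null DeviceRequests as "no request".

```python
def has_gpu_request(attrs):
    """Return True if the inspect data contains a device request for 'gpu'.

    Hypothetical helper; `attrs` is expected to look like container.attrs.
    """
    host_config = attrs.get("HostConfig") or {}
    requests = host_config.get("DeviceRequests") or []
    for request in requests:
        for entry in request.get("Capabilities") or []:
            # An entry is either a capability set such as ["gpu"]
            # or, in the flat form, a bare string such as "gpu".
            names = [entry] if isinstance(entry, str) else entry
            if "gpu" in names:
                return True
    return False


# Usage with hand-written sample data (not real inspect output):
with_gpu = {"HostConfig": {"DeviceRequests": [{"Count": 1, "Capabilities": [["gpu"]]}]}}
without_gpu = {"HostConfig": {"DeviceRequests": None}}
print(has_gpu_request(with_gpu), has_gpu_request(without_gpu))
```

With a live client, the same function could be applied to container.attrs after a containers.run(..., detach=True) call.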
podman info output
host:
  arch: amd64
  buildahVersion: 1.35.3
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.11-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: e21e7c85b7637e622f21c57675bf1154fc8b1866'
  cpuUtilization:
    idlePercent: 94.1
    systemPercent: 1.54
    userPercent: 4.36
  cpus: 20
  databaseBackend: boltdb
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  freeLocks: 2012
  hostname: geo
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.8.9-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 97578004480
  memTotal: 134802944000
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: /usr/lib/podman/aardvark-dns is owned by aardvark-dns 1.10.0-2
      path: /usr/lib/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: /usr/lib/podman/netavark is owned by netavark 1.10.3-1
    path: /usr/lib/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.15-1
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: /usr/bin/pasta is owned by passt 2024_04_26.d03c4e2-1
    version: |
      pasta 2024_04_26.d03c4e2
      Copyright Red Hat
      GNU General Public License, version 2 or later
      <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.3.0-1
    version: |-
      slirp4netns version 1.3.0
      commit: 8a4d4391842f00b9c940bb8f067964427eb0c964
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 1h 31m 25.00s (Approximately 0.04 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/jenni/.config/containers/storage.conf
  containerStore:
    number: 20
    paused: 0
    running: 14
    stopped: 6
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jenni/.local/share/containers/storage
  graphRootAllocated: 1578640605184
  graphRootUsed: 1019202039808
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 56
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/jenni/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.2
  Built: 1713438799
  BuiltTime: Thu Apr 18 07:13:19 2024
  GitCommit: 3304dd95b8978a8346b96b7d43134990609b3b29-dirty
  GoVersion: go1.22.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.2
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
$ nvidia-container-cli info
NVRM version: 550.78
CUDA version: 12.4
Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3080 Ti
Brand: GeForce
GPU UUID: GPU-c61acb21-8716-6540-271c-39beab917d03
Bus Location: 00000000:01:00.0
Architecture: 8.6
Additional information
No response