--gpu fails with --podman #293

Closed
awerlang opened this issue Oct 27, 2020 · 19 comments

Comments

@awerlang

awerlang commented Oct 27, 2020

Context: I'm troubleshooting a freezing issue with an Electron app (Visual Studio Code) running as a container. I'm clueless about what the real issue might be, but two things that occurred to me are the lack of GPU access and /dev/shm. I'm on an nvidia system, but the same issue reproduces on an amd system.

To move forward with this, I'm using x11docker with a simpler container on the amd system, using podman rootless:

x11docker --podman --cap-default --gpu x11docker/xfce glxgears

It renders a black screen. Without the --gpu flag it renders correctly using a lot of cpu. On the host it renders using the gpu.

The command above translates to:

podman run --tty --detach \
  --name x11docker_X114_x11docker-xfce-glxgears_69062233839 \
  --user 1000:100 \
  --env USER=user \
  --userns=keep-id \
  --group-add 481 \
  --group-add 483 \
  --security-opt label=type:container_runtime_t \
  --volume '/usr/bin/catatonit':'/usr/local/bin/init':ro \
  --tmpfs /run --tmpfs /run/lock \
  --volume '/home/user/.cache/x11docker/x11docker-xfce-glxgears-69062233839/share':'/x11docker':rw \
  --device '/dev/dri':'/dev/dri':rw \
  --device '/dev/vga_arbiter':'/dev/vga_arbiter':rw \
  --volume '/tmp/.X11-unix/X114':'/X114':rw \
  --workdir '/tmp' \
  --entrypoint env \
  --env 'container=docker' \
  --env 'XAUTHORITY=/x11docker/Xauthority.client' \
  --env 'DISPLAY=:114' \
  --env 'HOME=/home/user' \
  --env 'XDG_RUNTIME_DIR=/tmp/XDG_RUNTIME_DIR' \
  -- x11docker/xfce /usr/local/bin/init -- /bin/sh - /x11docker/containerrc

I reduced it to:

podman run --rm \
    --device '/dev/dri':'/dev/dri':rw \
    --device '/dev/vga_arbiter':'/dev/vga_arbiter':rw \
    --volume /tmp/.X11-unix:/tmp/.X11-unix \
    --volume $XAUTHORITY:/tmp/auth \
    --tmpfs /run --tmpfs /run/lock \
    --env DISPLAY \
    --env XAUTHORITY=/tmp/auth \
    -it --group-add 483 --group-add 481 \
    x11docker/xfce glxgears

It's a slightly different command, and it causes some flickering on screen that forces me to restart the display manager.

Full logs
x11docker note: Option --podman: experimental only.
  To avoid a prompt for root password, you might have to execute:
    sysctl -w kernel.unprivileged_userns_clone=1
  Please report issues at: https://github.com/mviereck/x11docker/issues/255

DEBUGNOTE[14:01:28,800]: traperror: Command at Line 6379 returned with error code 1:
 grep 172.17.0.1
 8607 - ::check_host::main::main
DEBUGNOTE[14:01:28,805]: time to say goodbye (traperror)
DEBUGNOTE[14:01:28,810]: traperror: Command at Line 6379 returned with error code 1:
 Hostip="$(ip -4 -o a | grep 'docker0' | awk '{print $4}' | cut -d/ -f1 | grep 172.17.0.1)"
 8607 - ::check_host::main::main
DEBUGNOTE[14:01:28,814]: time to say goodbye (traperror)
DEBUGNOTE[14:01:28,865]: check_host(): ps can watch root processes: yes
DEBUGNOTE[14:01:28,896]: host user: user 1000:100 /home/user
DEBUGNOTE[14:01:29,115]: storeinfo(): cache=/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606
DEBUGNOTE[14:01:29,123]: storeinfo(): stdout=/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/share/stdout
DEBUGNOTE[14:01:29,131]: storeinfo(): stderr=/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/share/stderr
DEBUGNOTE[14:01:29,151]: storeinfo(): x11dockerpid=22773
DEBUGNOTE[14:01:29,193]:  
x11docker version: 6.6.3-beta
docker version:    podman version 2.1.1
Host system:       "openSUSE Tumbleweed"
Host architecture: amd64 (x86_64)
Command:           '/home/user/bin/x11docker' '--debug' '--podman' '--cap-default' '--gpu' 'x11docker/xfce' 'glxgears'  
Parsed options:     --debug --podman --cap-default --gpu -- 'x11docker/xfce' 'glxgears'
DEBUGNOTE[14:01:29,198]: --xpra-xwayland: xpra not found.
 You can look for the package name of this command at:  
https://github.com/mviereck/x11docker/wiki/dependencies#table-of-all-packages
DEBUGNOTE[14:01:29,202]: --xpra-xwayland: weston not found.
 You can look for the package name of this command at:  
https://github.com/mviereck/x11docker/wiki/dependencies#table-of-all-packages
DEBUGNOTE[14:01:29,206]: --xpra-xwayland: xdotool not found.
 You can look for the package name of this command at:  
https://github.com/mviereck/x11docker/wiki/dependencies#table-of-all-packages
DEBUGNOTE[14:01:29,211]: Dependency check for --xpra-xwayland: 1
DEBUGNOTE[14:01:29,215]: --weston-xwayland: weston not found.
 You can look for the package name of this command at:  
https://github.com/mviereck/x11docker/wiki/dependencies#table-of-all-packages
DEBUGNOTE[14:01:29,220]: Dependency check for --weston-xwayland: 1
DEBUGNOTE[14:01:29,224]: Dependency check for --kwin-xwayland: 0
DEBUGNOTE[14:01:29,229]: Dependencies of --kwin-xwayland already checked: 0  
x11docker note: Using X server option --kwin-xwayland

DEBUGNOTE[14:01:29,233]: storeinfo(): xserver=--kwin-xwayland
x11docker WARNING: Option --gpu degrades container isolation.
 Container gains access to GPU hardware.
 This allows reading host window content (palinopsia leak)
 and GPU rootkits (compare proof of concept: jellyfish).

x11docker WARNING: Option --cap-default disables security hardening
 for containers done by x11docker. Default docker capabilities are allowed.
 This is considered to be less secure.

x11docker note: Option --cap-default: Enabling option --newprivileges.
 You can avoid this with --newprivileges=no

DEBUGNOTE[14:01:29,258]: container user: user 1000:100 /home/user
DEBUGNOTE[14:01:29,288]: waitforlogentry(): tailstdout: Waiting for logentry "x11docker=ready" in store.info
DEBUGNOTE[14:01:29,289]: waitforlogentry(): tailstderr: Waiting for logentry "x11docker=ready" in store.info
DEBUGNOTE[14:01:29,301]: storepid(): Stored pid '23331' of 'watchpidlist': 23331 pts/2    00:00:00 bash
DEBUGNOTE[14:01:29,316]: storepid(): Stored pid '23352' of 'watchmessagefifo': 23352 pts/2    00:00:00 bash
DEBUGNOTE[14:01:29,474]: storeinfo(): DISPLAY=:124
DEBUGNOTE[14:01:29,482]: storeinfo(): XAUTHORITY=/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/share/Xauthority.client
DEBUGNOTE[14:01:29,490]: storeinfo(): XSOCKET=/tmp/.X11-unix/X124
DEBUGNOTE[14:01:29,498]: storeinfo(): WAYLAND_DISPLAY=wayland-124
DEBUGNOTE[14:01:29,506]: storeinfo(): XDG_RUNTIME_DIR=/run/user/1000
DEBUGNOTE[14:01:29,514]: storeinfo(): Xenv= DISPLAY=:124 XAUTHORITY=/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/share/Xauthority.client XSOCKET=/tmp/.X11-unix/X124 WAYLAND_DISPLAY=wayland-124 XDG_RUNTIME_DIR=/run/user/1000
DEBUGNOTE[14:01:29,639]: X server command:
 /usr/bin/Xwayland :124  
 -retro
 +extension RANDR
 +extension RENDER
 +extension GLX
 +extension XVideo
 +extension DOUBLE-BUFFER
 +extension SECURITY
 +extension DAMAGE
 +extension X-Resource
 -extension XINERAMA -xinerama
 -extension MIT-SHM
 +extension Composite +extension COMPOSITE
 -extension XTEST -tst
 -dpms
 -s off
 -auth /home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/Xauthority.server
 -nolisten tcp
 -dpi 96
DEBUGNOTE[14:01:29,644]: Compositor command:
 env QT_XKB_CONFIG_ROOT=/usr/share/X11/xkb kwin_wayland
 --xwayland
 --socket=wayland-124
 --width=1264 --height=672
 --x11-display=:0
DEBUGNOTE[14:01:29,760]: storeinfo(): tini=/usr/bin/catatonit
DEBUGNOTE[14:01:29,770]: Users and terminal:
 x11docker was started by:                       user
 As host user serves (running X, storing cache): user
 Container user will be:                         user
 Container user password:                        x11docker
 Getting permission to run docker with:          eval  
 Terminal for password frontend:                 bash -c
 Running in a terminal:                          yes
 Running on console:                             no
 Running over SSH:                               no
 Running sourced:                                no
 bash $-:                                        huBE
x11docker WARNING: Option --newprivileges=yes: x11docker does not set  
 docker run option --security-opt=no-new-privileges.  
 That degrades container security.
 However, this is still within a default docker setup.

DEBUGNOTE[14:01:29,776]: storeinfo(): containername=x11docker_X124_x11docker-xfce-glxgears_18088751606
x11docker WARNING: Sharing device file: /dev/dri

x11docker WARNING: Sharing device file: /dev/vga_arbiter

DEBUGNOTE[14:01:30,023]: Docker command:
 podman run --tty --detach
 --name x11docker_X124_x11docker-xfce-glxgears_18088751606
 --user 1000:100
 --env USER=user
 --userns=keep-id
 --group-add 481
 --group-add 483
 --security-opt label=type:container_runtime_t
 --volume '/usr/bin/catatonit':'/usr/local/bin/init':ro
 --tmpfs /run --tmpfs /run/lock
 --volume '/home/user/.cache/x11docker/x11docker-xfce-glxgears-18088751606/share':'/x11docker':rw
 --device '/dev/dri':'/dev/dri':rw
 --device '/dev/vga_arbiter':'/dev/vga_arbiter':rw
 --volume '/tmp/.X11-unix/X124':'/X124':rw
 --workdir '/tmp'
 --entrypoint env
 --env 'container=docker'
 --env 'XAUTHORITY=/x11docker/Xauthority.client'
 --env 'DISPLAY=:124'
 --env 'HOME=/home/user'
 --env 'XDG_RUNTIME_DIR=/tmp/XDG_RUNTIME_DIR'
 -- x11docker/xfce /usr/local/bin/init -- /bin/sh - /x11docker/containerrc
DEBUGNOTE[14:01:30,301]: storepid(): Stored pid '24132' of 'containershell': 24132 pts/2    00:00:00 bash
DEBUGNOTE[14:01:30,312]: Running xtermrc: Ask for password if needed (no)
DEBUGNOTE[14:01:30,318]: waitforlogentry(): start_xserver(): Waiting for logentry "readyforX=ready" in store.info
DEBUGNOTE[14:01:30,336]: Running dockerrc: Setup as root or as user docker on host.
DEBUGNOTE[14:01:30,414]: dockerrc: Found default Runtime:  
DEBUGNOTE[14:01:30,428]: dockerrc: All  
DEBUGNOTE[14:01:30,444]: dockerrc: Container Runtime:  
DEBUGNOTE[14:01:30,460]: storeinfo(): runtime=
DEBUGNOTE[14:01:30,598]: dockerrc: Image architecture: amd64
DEBUGNOTE[14:01:30,610]: dockerrc: Image USER:  
DEBUGNOTE[14:01:30,624]: storeinfo(): containeruser=user
DEBUGNOTE[14:01:30,637]: dockerrc: Image ENTRYPOINT:  
DEBUGNOTE[14:01:30,648]: dockerrc: Image WORKDIR:  
DEBUGNOTE[14:01:30,663]: storeinfo(): readyforX=ready
DEBUGNOTE[14:01:30,675]: waitforlogentry(): dockerrc: Waiting for logentry "xinitrc is ready" in xinit.log
DEBUGNOTE[14:01:30,837]: waitforlogentry(): start_xserver(): Found log entry "readyforX=ready" in store.info.
DEBUGNOTE[14:01:30,852]: storeinfo(): compositorpid=24513
DEBUGNOTE[14:01:30,874]: waitforlogentry(): start_compositor(): Waiting for logentry "X-Server" in compositor.log
^CDEBUGNOTE[14:01:34,771]: Received SIGINT
DEBUGNOTE[14:01:34,777]: storeinfo(): error=130
DEBUGNOTE[14:01:34,785]: Terminating x11docker.
DEBUGNOTE[14:01:34,790]: time to say goodbye (finish)
DEBUGNOTE[14:01:34,809]: finish(): Checking pid 24132 (containershell): 24132 pts/2    00:00:00 bash
DEBUGNOTE[14:01:34,825]: termpid(): Terminating 24132 (containershell): 24132 pts/2    00:00:00 bash
DEBUGNOTE[14:01:34,952]: finish(): Checking pid 23352 (watchmessagefifo): 23352 pts/2    00:00:00 bash
DEBUGNOTE[14:01:34,970]: finish(): Checking pid 23331 (watchpidlist): 23331 pts/2    00:00:00 bash
DEBUGNOTE[14:01:34,983]: termpid(): Terminating 23331 (watchpidlist): 23331 pts/2    00:00:00 bash
DEBUGNOTE[14:01:35,100]: Removing container x11docker_X124_x11docker-xfce-glxgears_18088751606
Error: Failed to evict container: "": Failed to find container "x11docker_X124_x11docker-xfce-glxgears_18088751606" in state: no container with name or ID x11docker_X124_x11docker-xfce-glxgears_18088751606 found: no such container
x11docker note: Failed to remove container x11docker_X124_x11docker-xfce-glxgears_18088751606

DEBUGNOTE[14:01:35,171]: termpid(): Terminating 23352 (watchmessagefifo): 23352 pts/2    00:00:00 bash
DEBUGNOTE[14:01:35,293]: x11docker exit code: 130
DEBUGNOTE[14:01:35,781]: waitforlogentry(): tailstderr: Stopped waiting for x11docker=ready in store.info due to terminating signal.
DEBUGNOTE[14:01:35,781]: waitforlogentry(): tailstdout: Stopped waiting for x11docker=ready in store.info due to terminating signal.


UPDATE

Solution: add the privileged flag

x11docker --podman --cap-default --gpu -- --privileged -- x11docker/xfce glxgears -info

@awerlang
Author

While the above is using podman rootless, I get the same black screen with rootful podman.

@mviereck
Owner

mviereck commented Oct 27, 2020

In the log I found x11docker uses --kwin-xwayland to provide an accelerated X server. Maybe there is an issue with kwin.
Please install weston so x11docker will use --weston-xwayland and try again.
Option --hostdisplay is also worth a try.

Edit: it might as well be an issue with podman. Could you try with docker instead?
Edit2: Just did a check here, --gpu fails with --podman but succeeds using docker. I'll investigate.

@awerlang
Author

Updating the issue with other data points:

  • --hostdisplay with --gpu produces flickering; I cannot use the machine until I restart graphics;

  • I installed weston, no dice, but running glxgears -info leads to:

libGL error: MESA-LOADER: failed to retrieve device information
libGL error: Version 4 or later of flush extension not found
libGL error: failed to load driver: i915

I mistakenly called this an amd system, it's really a hybrid system running on intel/amd. On the host it uses the Intel graphics. I created an image based on the host system, same failure.

  • I got it working on the nvidia system using the nvidia container toolkit to inject the appropriate drivers in the container. This is the minimal repro:
podman run --rm \
    --volume /tmp/.X11-unix:/tmp/.X11-unix \
    --volume $XAUTHORITY:/tmp/auth \
    --env DISPLAY \
    --env XAUTHORITY=/tmp/auth \
    --env NVIDIA_VISIBLE_DEVICES=all \
    --env NVIDIA_DRIVER_CAPABILITIES=display \
    x11docker/xfce glxgears -info

So podman rootless works with a nvidia discrete gpu, but doesn't work with an intel integrated gpu.

@mviereck
Owner

mviereck commented Oct 28, 2020

With docker --gpu works well, but not with podman.
Podman has an issue setting up /etc/passwd in the container. The container user setup done by x11docker partially fails because podman locks the file and disallows write access.
To access the GPU, the container user must be a member of the groups video and render. x11docker also provides this with option --group-add. GPU access fails nonetheless.
I assume the /etc/passwd issue is the underlying reason why the combination --podman --gpu does not work yet.

Until the /etc/passwd issue in podman is fixed, I recommend using --gpu with Docker only.
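
For anyone who wants to verify the group part by hand, the following is a minimal sketch of how the numeric group IDs for --group-add could be derived from the device files themselves. The function name group_add_args is made up for illustration, and the sketch assumes GNU coreutils stat (the -c option). It only reproduces the group setup; as noted above, GPU access with podman failed even with the groups added.

```shell
# Print one "--group-add GID" line per distinct group that owns the given
# device files. Hypothetical helper, assumes GNU coreutils `stat -c`.
group_add_args() {
    for dev in "$@"; do
        # Skip paths that do not exist (e.g. an unmatched glob).
        [ -e "$dev" ] || continue
        stat -c '%g' "$dev"
    done | sort -un | while IFS= read -r gid; do
        printf '%s %s\n' --group-add "$gid"
    done
}
```

A possible call would be `group_add_args /dev/dri/card* /dev/dri/renderD*`, whose output can be compared with the --group-add values in the generated podman command.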

@mviereck mviereck changed the title --gpu fails to render a 3d program --gpu fails with --podman Oct 28, 2020
mviereck added a commit that referenced this issue Oct 28, 2020
@awerlang
Author

I have a working podman rootless solution:

x11docker --podman --cap-default --gpu -- --privileged -- x11docker/xfce glxgears -info

It renders the animation using the gpu and prints information about the gpu, instead of the software stack (llvmpipe)
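
The distinction between GPU and software rendering can be checked from the renderer line that glxgears -info prints. Below is a rough sketch of such a check; the function name classify_renderer and the list of software renderer names (llvmpipe, softpipe, swrast) are assumptions for illustration, not something x11docker provides.

```shell
# Read `glxgears -info` (or `glxinfo`) output on stdin and print
# "software", "hardware", or "unknown". Hypothetical helper.
classify_renderer() {
    # Pick the first renderer line from either tool's output format.
    line=$(grep -E 'GL_RENDERER|OpenGL renderer string' | head -n 1) || true
    case "$line" in
        "") echo "unknown" ;;
        *llvmpipe*|*softpipe*|*swrast*) echo "software" ;;
        *) echo "hardware" ;;
    esac
}
```

Since glxgears keeps running, it is easiest to capture its output to a file first and then run something like `classify_renderer < glxgears.log` (the file name is only an example).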

From man podman run:

   --privileged=true|false

   Give extended privileges to this container. The default is false.

   By default, Podman containers are unprivileged (=false) and cannot, for example, modify parts of the operating system.  This is because by default a container is only  allowed  limited  access  to  devices.   A
   "privileged" container is given the same access to devices as the user launching the container.

   A  privileged  container turns off the security features that isolate the container from the host. Dropped Capabilities, limited devices, read-only mount points, Apparmor/SELinux separation, and Seccomp filters
   are all disabled.

   Rootless containers cannot have more privileges than the account that launched them.

I attempted to add CAP_SYS_ADMIN, but it wasn't enough. I'm unsure whether the required privilege can be narrowed down, but given that I can use nvidia without --privileged, it seems doable. Nevertheless, it's safer than a rootful solution IMO.

@mviereck
Owner

mviereck commented Oct 28, 2020

I have a working podman rootless solution:
x11docker --podman --cap-default --gpu -- --privileged -- x11docker/xfce glxgears -info
It renders the animation using the gpu and prints information about the gpu, instead of the software stack

Interesting, thank you! It is odd that --privileged helps because the gpu device files are shared already with --device '/dev/dri':'/dev/dri':rw --device '/dev/vga_arbiter':'/dev/vga_arbiter':rw.

I attempted to add CAP_SYS_ADMIN, but it wasn't enough.

I assume that capabilities will make no difference. They only define what user root is allowed to do.
Maybe podman has some apparmor or seccomp setup that blocks access. You could try to add
--security-opt seccomp=unconfined --security-opt apparmor=unconfined instead of --privileged.
Example:

x11docker --podman --gpu --cap-default -- --security-opt seccomp=unconfined --security-opt apparmor=unconfined -- x11docker/xfce glxgears

If it works, also drop --cap-default.

@awerlang
Author

Maybe podman has some apparmor or seccomp setup that blocks access. You could try to add
--security-opt seccomp=unconfined --security-opt apparmor=unconfined instead of --privileged.
Example:

x11docker --podman --gpu --cap-default -- --security-opt seccomp=unconfined --security-opt apparmor=unconfined -- x11docker/xfce glxgears

If it works, also drop --cap-default.

That didn't work. This article discusses a few other differences, but I haven't got any other ideas.

A curious thing is that --hostdisplay doesn't make use of the Intel gpu. While on the nvidia system I can share the :0 display and it will pick up the nvidia gpu.

Feel free to close this issue if you feel the --privileged flag is a solution. I'll be able to continue troubleshooting the motivating issue from here. Thank you for your help!

@mviereck
Owner

Feel free to close this issue if you feel the --privileged flag is a solution.

I'll leave the ticket open because I don't consider --privileged a solution. It grants far too many privileges, and x11docker follows the principle of least privilege.
It might be ok for your use case of just checking your own application, but it can't be recommended in general.

A curious thing is that --hostdisplay doesn't make use of the Intel gpu. While on the nvidia system I can share the :0 display and it will pick up the nvidia gpu.

The intel GPU does not work even with --privileged?
Do you run your nvidia card with the closed source driver or with the open source MESA driver/nouveau?

This article discusses a few other differences,

A useful article, thank you for the link. It also provides a command to disable SELinux for the container (a different one than x11docker uses). That might be worth a try:

x11docker --podman --gpu --cap-default -- --security-opt label=disable --security-opt seccomp=unconfined --security-opt apparmor=unconfined -- x11docker/xfce glxgears

There are further options to track down the needed privileges, I'll look later again.
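
One way to organize that search is to print the candidate commands up front and try them one at a time. The sketch below only uses options that already appear in this thread; the function names candidates and print_trials are invented for illustration, and the ordering (single options first, then the combination, then --privileged) is just one reasonable choice.

```shell
# Extra podman options to try, one set per line: single options first,
# then all of them combined, and --privileged as the last resort.
candidates() {
    cat <<'EOF'
--security-opt seccomp=unconfined
--security-opt apparmor=unconfined
--security-opt label=disable
--security-opt seccomp=unconfined --security-opt apparmor=unconfined --security-opt label=disable
--privileged
EOF
}

# Print (but do not run) one x11docker command per candidate set.
print_trials() {
    candidates | while IFS= read -r opts; do
        printf 'x11docker --podman --gpu --cap-default -- %s -- x11docker/xfce glxgears -info\n' "$opts"
    done
}
```

Running `print_trials` lists the commands so each can be copied and tested separately; nothing is executed automatically.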

@awerlang
Author

A curious thing is that --hostdisplay doesn't make use of the Intel gpu. While on the nvidia system I can share the :0 display and it will pick up the nvidia gpu.

The intel GPU does not work even with --privileged?

It doesn't.

Do you run your nvidia card with the closed source driver or with the open source MESA driver/nouveau?

Closed source.

A useful article, thank you for the link. It also provides a command to disable SELinux for the container (a different one than x11docker uses). That might be worth a try:

x11docker --podman --gpu --cap-default -- --security-opt label=disable --security-opt seccomp=unconfined --security-opt apparmor=unconfined -- x11docker/xfce glxgears

Yes, but I don't have SELinux.

@mviereck
Owner

The intel GPU does not work even with --privileged?

It doesn't.

In that case there is not much left to try.
I can only imagine that the container user groups somehow are not applied although the output of id looks right, or that a driver is missing. Did you try with x11docker/xfce glxgears? It contains the free MESA drivers.
Note that for tests on intel it must not be an nvidia image and the nvidia container runtime must not be installed.

I tried --podman here with an AMD GPU and --privileged, it fails, too. glxgears gives the messages:

libGL error: MESA-LOADER: failed to retrieve device information
libGL error: image driver extension not found
libGL error: failed to load driver: radeon
1459 frames in 5.0 seconds = 291.701 FPS
...

With docker everything works well.

On the nvidia machine, is the nvidia container runtime installed?

@awerlang
Author

The intel GPU does not work even with --privileged?

It doesn't.

In that case there is not much left to try.
I can only imagine that the container user groups somehow are not applied although the output of id looks right, or that a driver is missing. Did you try with x11docker/xfce glxgears? It contains the free MESA drivers.
Note that for tests on intel it must not be an nvidia image and the nvidia container runtime must not be installed.

Yup, that's the image I used.

On the nvidia machine, is the nvidia container runtime installed?

Not the runtime (e.g. no --runtime=nvidia), but the hook from nvidia-container-toolkit instead that's triggered by the NVIDIA_VISIBLE_DEVICES variable. On my own images with VS Code and Chromium, Mesa had to be available as well.

@mviereck
Owner

mviereck commented Oct 29, 2020

So I am running out of ideas here.
Overall I'd rather wait for podman to get some fixes in the future and leave the --podman option in an undocumented experimental state.
I'll leave the ticket open. If you have new ideas or make new findings, I am happy to hear about it.

@mviereck
Owner

mviereck commented Oct 30, 2020

I just found that there is no /dev/dri in a podman container at all, even with --privileged.
Of course, GPU acceleration must fail without the device files.

A very odd podman bug.

Edit: I found that podman shares the device file with --volume /dev/dri:/dev/dri:rw. Finally the device files appear in the container. Adding --privileged, too, finally allows GPU access.
What a mess.
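
The difference between --device and --volume can be reproduced outside of x11docker by listing /dev/dri from inside a throwaway container. This is a minimal sketch under a few assumptions: the helper name probe_dri and the PODMAN and IMAGE variables are invented here so that the container engine and image can be swapped, and the result will depend on the podman version in use.

```shell
# Container engine and image; both can be overridden from the environment.
PODMAN="${PODMAN:-podman}"
IMAGE="${IMAGE:-x11docker/xfce}"

# List /dev/dri inside a temporary container, sharing it either with
# --device or with --volume. Hypothetical helper.
probe_dri() {
    case "$1" in
        device) opt="--device" ;;
        volume) opt="--volume" ;;
        *) echo "usage: probe_dri device|volume" >&2; return 2 ;;
    esac
    "$PODMAN" run --rm "$opt" /dev/dri:/dev/dri:rw "$IMAGE" ls -l /dev/dri
}
```

Calling `probe_dri device` and then `probe_dri volume` shows whether the device files appear in each case. Seeing the files does not yet mean they are usable; in the test above, access additionally required --privileged.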

@awerlang
Author

I guess it depends on the vendor:

  • nvidia with nvidia-container-toolkit, without --privileged: there's no /dev/dri mapped;
  • intel, with --privileged --device: there's only /dev/dri/renderD12{8,9}.

Also on my systems it doesn't require container user to have {video,render} groups. I suppose it's udev/ACL rules in play on the host.
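
Whether group membership matters on a given host can be checked by testing whether the current user is able to open the device files at all. Here is a small sketch of such a check; the helper name report_access is made up, and it relies on the shell's -r and -w tests, which on typical Linux shells should reflect ACLs as well as plain owner/group permissions.

```shell
# Report for each path whether it exists and whether the current user
# has both read and write access. Hypothetical helper.
report_access() {
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "$dev: missing"
        elif [ -r "$dev" ] && [ -w "$dev" ]; then
            echo "$dev: read/write"
        else
            echo "$dev: no read/write access"
        fi
    done
}
```

Running `report_access /dev/dri/*` on the host as a user who is not in the video or render groups, and again inside the container, gives a quick comparison.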

@mviereck
Owner

mviereck commented Oct 30, 2020

This very privileged setup fails: (Covers all aspects of https://www.redhat.com/sysadmin/privileged-flag-container-engines, should replace --privileged)

x11docker --podman --gpu --cap-default --hostnet --hostipc --  --volume /dev/dri:/dev/dri:rw --security-opt label=disable --security-opt seccomp=unconfined --cap-add all --uts=host --pid=host  -- x11docker/xfce glxgears

This one works:

x11docker --podman --gpu --  --volume /dev/dri:/dev/dri:rw --privileged  -- x11docker/xfce glxgears

intel, with --privileged --device: there's only /dev/dri/renderD12{8,9}.

Odd. Here (AMD) no /dev/dri appears at all in a podman container, neither with --privileged nor with --device. Only --volume works, and for access --privileged is needed, too.

Also on my systems it doesn't require container user to have {video,render} groups. I suppose it's udev/ACL rules in play on the host.

Here at least the container user needs the groups.

@awerlang
Author

awerlang commented Nov 2, 2020

This very privileged setup fails: (Covers all aspects of https://www.redhat.com/sysadmin/privileged-flag-container-engines, should replace --privileged)

x11docker --podman --gpu --cap-default --hostnet --hostipc --  --volume /dev/dri:/dev/dri:rw --security-opt label=disable --security-opt seccomp=unconfined --cap-add all --uts=host --pid=host  -- x11docker/xfce glxgears

The article mentions masked paths. A more recent release (2.1) masks /sys/dev as well, and from a few tests I've run I can tell it needs to be available inside the container. There's discussion at containers/podman#7801.

As a workaround, only --privileged makes it work; no other flag does. Masked paths are applied after mounts, and only when unprivileged.
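
A rough way to see whether a path such as /sys/dev is masked is to inspect it from inside the container. The sketch below is a heuristic only, under the assumption that the runtime masks directories with an empty read-only tmpfs and masks files by bind-mounting /dev/null over them; the function name looks_masked is invented for illustration. It is meant for paths that are normally regular files or populated directories, not for device nodes.

```shell
# Heuristic: succeed if the path looks masked by the container runtime.
# Assumes masked directories appear empty and masked files appear as a
# character device (a /dev/null bind mount). Hypothetical helper.
looks_masked() {
    p=$1
    if [ -d "$p" ]; then
        # An empty directory where content is expected suggests masking.
        [ -z "$(ls -A "$p" 2>/dev/null)" ]
    elif [ -c "$p" ]; then
        return 0
    else
        return 1
    fi
}
```

Inside a container, `looks_masked /sys/dev && echo "masked"` could then be compared between a run with and without --privileged.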

@mviereck
Owner

mviereck commented Nov 4, 2020

Thank you for sharing this research! So it is basically a --device issue of podman that it does not provide the specified device. Let's hope that this will be fixed.

mviereck added a commit that referenced this issue Nov 6, 2020
--podman -gpu --alsa: share devices with --volume #293 #255
@mviereck
Owner

mviereck commented Nov 6, 2020

As a workaround, x11docker now shares the device files with --volume and gives a note that one has to add --privileged, too.

@mviereck
Owner

As a workaround, x11docker now shares the device files with --volume and gives a note that one has to add --privileged, too.

I've removed the workaround. A fix is in the works at podman.
Currently podman fails with x11docker due to newly introduced podman bugs.
Closing here, please follow up in #255 .
