aarch64: Container won't start #290

Closed
Azkali opened this issue Oct 20, 2020 · 30 comments

Comments

@Azkali

Azkali commented Oct 20, 2020

Platform: Jetson-TX1
Architecture: aarch64
OS: Ubuntu 20.10
Docker version:

Client:
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.13.8
 Git commit:        4484c46
 Built:             Thu Oct 15 18:35:10 2020
 OS/Arch:           linux/arm64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       4484c46
  Built:            Wed Oct 14 13:25:32 2020
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.3.7-0ubuntu3
  GitCommit:        
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:        

Hi, I'm trying to use x11docker and pass the GPU through without using NVIDIA's runtime.

My tests have been successful outside of x11docker, and the GPU is correctly passed through and used inside the docker containers I tried.

Working test using docker only :

docker run --rm -it --device=/dev/nvhost-ctrl \
--device=/dev/nvhost-ctrl-gpu \
--device=/dev/nvhost-prof-gpu \
--device=/dev/nvmap \
--device=/dev/nvhost-gpu \
--device=/dev/nvhost-as-gpu \
-e DISPLAY=${DISPLAY} \
--network=host \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra \
x11docker/lxde

Note: I'm using x11docker-Dockerfiles but built for aarch64

x11docker command :

x11docker --xpra --debug -- "--device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra" x11docker/lxde 

debug log from x11docker :

DEBUGNOTE[19:37:17,189]: check_host(): ps can watch root processes: yes
DEBUGNOTE[19:37:17,319]: host user: azkali 1000:1000 /home/azkali
x11docker WARNING: User azkali is member of group docker.
  That allows unprivileged processes on host to gain root privileges.

DEBUGNOTE[19:37:18,011]: storeinfo(): cache=/home/azkali/.cache/x11docker/x11docker-lxde-15436602102
DEBUGNOTE[19:37:18,040]: storeinfo(): stdout=/home/azkali/.cache/x11docker/x11docker-lxde-15436602102/share/stdout
DEBUGNOTE[19:37:18,070]: storeinfo(): stderr=/home/azkali/.cache/x11docker/x11docker-lxde-15436602102/share/stderr
DEBUGNOTE[19:37:18,151]: storeinfo(): x11dockerpid=89514
DEBUGNOTE[19:37:18,566]: 
x11docker version: 6.6.3-beta
docker version:    Docker version 19.03.13, build 4484c46
Host system:       "Ubuntu 20.10"
Host architecture: arm64v8 (aarch64)
Command:           '/usr/bin/x11docker' '--xpra' '--debug' '--' '--device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra' 'x11docker/lxde' 
Parsed options:     --xpra --debug -- '--device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra' 'x11docker/lxde'
DEBUGNOTE[19:37:24,887]: Dependency check for --xpra: 0
DEBUGNOTE[19:37:24,902]: Dependencies of --xpra already checked: 0 
DEBUGNOTE[19:37:24,916]: Dependencies of --xpra already checked: 0 
DEBUGNOTE[19:37:24,929]: Dependencies of --xpra already checked: 0 
DEBUGNOTE[19:37:24,941]: storeinfo(): xserver=--xpra
x11docker note: Option --xpra: If you encounter issues with xpra, 
  you can try --nxagent instead.
  Rather use xpra from www.xpra.org than from distribution repositories.

x11docker WARNING: Found custom DOCKER_RUN_OPTIONS.
  x11docker will add them to 'docker run' command without
  a serious check for validity or security. Found options:
   --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra

DEBUGNOTE[19:37:25,049]: container user: azkali 1000:1000 /home/azkali
DEBUGNOTE[19:37:25,156]: waitforlogentry(): tailstdout: Waiting for logentry "x11docker=ready" in store.info
DEBUGNOTE[19:37:25,161]: waitforlogentry(): tailstderr: Waiting for logentry "x11docker=ready" in store.info
DEBUGNOTE[19:37:25,250]: storepid(): Stored pid '90435' of 'watchpidlist':   90435 pts/1    00:00:00 bash
DEBUGNOTE[19:37:25,382]: storepid(): Stored pid '90454' of 'watchmessagefifo':   90454 pts/1    00:00:00 bash
DEBUGNOTE[19:37:25,675]: storeinfo(): DISPLAY=:126
DEBUGNOTE[19:37:25,698]: storeinfo(): XAUTHORITY=/home/azkali/.cache/x11docker/x11docker-lxde-15436602102/share/Xauthority.client
DEBUGNOTE[19:37:25,726]: storeinfo(): XSOCKET=/tmp/.X11-unix/X126
DEBUGNOTE[19:37:25,748]: storeinfo(): XDG_RUNTIME_DIR=/run/user/1000
DEBUGNOTE[19:37:25,775]: storeinfo(): Xenv= DISPLAY=:126 XAUTHORITY=/home/azkali/.cache/x11docker/x11docker-lxde-15436602102/share/Xauthority.client XSOCKET=/tmp/.X11-unix/X126 XDG_RUNTIME_DIR=/run/user/1000
DEBUGNOTE[19:37:26,378]: Xpra server command:
  xpra start :126 --use-display \
  --csc-modules=none \
  --encodings=rgb \
  --microphone=no \
  --notifications=no \
  --pulseaudio=no \
  --socket-dirs='/home/azkali/.cache/x11docker/x11docker-lxde-15436602102' \
  --speaker=no \
  --start-via-proxy=no \
  --webcam=no \
  --xsettings=no \
  --clipboard=yes\
  --dbus-proxy=no \
  --daemon=no \
  --fake-xinerama=no \
  --file-transfer=off \
  --html=off \
  --opengl=noprobe \
  --mdns=no \
  --printing=no \
  --session-name='x11docker-lxde' \
  --start-new-commands=no \
  --systemd-run=no \
  --video-encoders=none \
  --dpi='75'
DEBUGNOTE[19:37:26,598]: Xpra client command:
  xpra attach :126 \
  --csc-modules=none \
  --encodings=rgb \
  --microphone=no \
  --notifications=no \
  --pulseaudio=no \
  --socket-dirs='/home/azkali/.cache/x11docker/x11docker-lxde-15436602102' \
  --speaker=no \
  --start-via-proxy=no \
  --webcam=no \
  --xsettings=no \
  --clipboard=no \
  --compress=0 \
  --modal-windows=no \
  --opengl=auto \
  --quality=100 \
  --video-decoders=none \
  --title='@title@ [in container]'
DEBUGNOTE[19:37:26,627]: X server command:
  /usr/bin/Xvfb :126  \
  -retro \
  +extension RANDR \
  +extension RENDER \
  +extension GLX \
  +extension XVideo \
  +extension DOUBLE-BUFFER \
  +extension SECURITY \
  +extension DAMAGE \
  +extension X-Resource \
  -extension XINERAMA -xinerama \
  -extension MIT-SHM \
  +extension Composite +extension COMPOSITE \
  +extension XTEST \
  -dpms \
  -s off \
  -auth /home/azkali/.cache/x11docker/x11docker-lxde-15436602102/Xauthority.server \
  -nolisten tcp \
  -dpi 75 \
  -screen 0 720x1280x24
DEBUGNOTE[19:37:27,349]: storeinfo(): tini=/usr/bin/docker-init
DEBUGNOTE[19:37:27,425]: Users and terminal:
  x11docker was started by:                       azkali
  As host user serves (running X, storing cache): azkali
  Container user will be:                         azkali
  Container user password:                        x11docker
  Getting permission to run docker with:          eval 
  Terminal for password frontend:                 bash -c
  Running in a terminal:                          yes
  Running on console:                             no
  Running over SSH:                               no
  Running sourced:                                no
  bash $-:                                        huBE
DEBUGNOTE[19:37:27,459]: storeinfo(): containername=x11docker_X126_x11docker-lxde_15436602102
DEBUGNOTE[19:37:28,741]: Docker command:
  docker run --tty --rm --detach \
  --name x11docker_X126_x11docker-lxde_15436602102 \
  --user 1000:1000 \
  --env USER=azkali \
  --userns host \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt label=type:container_runtime_t \
  --volume '/usr/bin/docker-init':'/usr/local/bin/init':ro \
  --tmpfs /run --tmpfs /run/lock \
  --volume '/home/azkali/.cache/x11docker/x11docker-lxde-15436602102/share':'/x11docker':rw \
  --volume '/tmp/.X11-unix/X126':'/X126':rw \
  --workdir '/tmp' \
  --entrypoint env \
  --env 'container=docker' \
  --env 'NO_AT_BRIDGE=1' \
  --env 'GTK_CSD=0' \
  --env 'GTK_OVERLAY_SCROLLING=0' \
  --env 'MWWM=allwm' \
  --env 'MWNO_RIT=true' \
  --env 'MWNOCAPTURE=true' \
  --env 'QT_X11_NO_NATIVE_MENUBAR=1' \
  --env 'UBUNTU_MENUPROXY=' \
  --env 'XAUTHORITY=/x11docker/Xauthority.client' \
  --env 'DISPLAY=:126' \
  --env 'HOME=/home/azkali' \
  --env 'XDG_RUNTIME_DIR=/tmp/XDG_RUNTIME_DIR' \
   --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra \
  -- x11docker/lxde /usr/local/bin/init -- /bin/sh - /x11docker/containerrc
DEBUGNOTE[19:37:30,024]: storepid(): Stored pid '91462' of 'containershell':   91462 pts/1    00:00:00 bash
DEBUGNOTE[19:37:30,034]: Running xtermrc: Ask for password if needed (no)
DEBUGNOTE[19:37:30,101]: waitforlogentry(): start_xserver(): Waiting for logentry "readyforX=ready" in store.info
DEBUGNOTE[19:37:30,162]: Running dockerrc: Setup as root or as user docker on host.
DEBUGNOTE[19:37:30,276]: dockerrc: Found default Runtime: runc
DEBUGNOTE[19:37:30,344]: dockerrc: All  Runtimes: nvidia runc
DEBUGNOTE[19:37:30,421]: dockerrc: Container Runtime: runc
DEBUGNOTE[19:37:30,486]: storeinfo(): runtime=runc
DEBUGNOTE[19:37:30,721]: dockerrc: Image architecture: arm64
DEBUGNOTE[19:37:30,764]: dockerrc: Image CMD: /usr/local/bin/start
DEBUGNOTE[19:37:30,820]: dockerrc: Image USER: 
DEBUGNOTE[19:37:30,859]: storeinfo(): containeruser=azkali
DEBUGNOTE[19:37:30,902]: dockerrc: Image ENTRYPOINT: 
DEBUGNOTE[19:37:30,944]: dockerrc: Image WORKDIR: 
DEBUGNOTE[19:37:30,982]: storeinfo(): readyforX=ready
DEBUGNOTE[19:37:31,018]: waitforlogentry(): dockerrc: Waiting for logentry "xinitrc is ready" in xinit.log
DEBUGNOTE[19:37:31,211]: waitforlogentry(): start_xserver(): Found log entry "readyforX=ready" in store.info.
DEBUGNOTE[19:37:31,235]: waitforlogentry(): xpra: Waiting for logentry "xinitrc=ready" in store.info
DEBUGNOTE[19:37:31,265]: storepid(): Stored pid '91890' of 'xpraloop':   91890 pts/1    00:00:00 bash
DEBUGNOTE[19:37:33,555]: Running xinitrc
DEBUGNOTE[19:37:33,907]: xinitrc: Created cookie: linux/unix:126  MIT-MAGIC-COOKIE-1  00996254f0a0d224db225cc4134f79f8 
#ffff#6c696e7578#:126  MIT-MAGIC-COOKIE-1  00996254f0a0d224db225cc4134f79f8
DEBUGNOTE[19:37:34,936]: waitforlogentry(): xpra: Found log entry "xinitrc=ready" in store.info.
DEBUGNOTE[19:37:34,953]: storeinfo(): xinitrc=ready
DEBUGNOTE[19:37:34,962]: Running Xpra server
DEBUGNOTE[19:37:35,101]: storepid(): Stored pid '92187' of 'xpraserver':   92187 pts/1    00:00:00 bash
DEBUGNOTE[19:37:35,171]: Running Xpra client
DEBUGNOTE[19:37:35,171]: waitforlogentry(): dockerrc: Found log entry "xinitrc is ready" in xinit.log.
DEBUGNOTE[19:37:35,306]: storepid(): Stored pid '92253' of 'xpraclient':   92253 pts/1    00:00:00 bash
DEBUGNOTE[19:37:37,116]: waitforlogentry(): tailstderr: Waiting since 11s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:37,156]: waitforlogentry(): tailstdout: Waiting since 11s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:38,175]: waitforlogentry(): tailstderr: Waiting since 12s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:38,209]: waitforlogentry(): tailstdout: Waiting since 12s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:39,220]: waitforlogentry(): tailstderr: Waiting since 13s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:39,260]: waitforlogentry(): tailstdout: Waiting since 13s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:40,209]: storeinfo(): containerid=5e0869613396b1353a85540262b7cf2c61f4ab13702b7f51a9aee3b5df57185e
DEBUGNOTE[19:37:40,277]: waitforlogentry(): containerrc: Waiting for logentry "containerrootrc=ready" in store.info
DEBUGNOTE[19:37:40,281]: waitforlogentry(): tailstderr: Waiting since 14s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:40,330]: waitforlogentry(): tailstdout: Waiting since 14s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:41,040]: dockerrc: Container is up and running.
DEBUGNOTE[19:37:41,190]: dockerrc: 1. check for PID 1: 92377
DEBUGNOTE[19:37:41,246]: storeinfo(): pid1pid=92377
DEBUGNOTE[19:37:41,364]: storeinfo(): containerip=172.17.0.2
DEBUGNOTE[19:37:41,432]: waitforlogentry(): start_docker(): Waiting for logentry "dockerrc=ready" in store.info

x11docker ERROR: Got error message from docker daemon: 
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown 
 
  Last lines of logfile: 

  Type 'x11docker --help' for usage information
  Debug options: '--verbose' (full log) or '--debug' (log excerpt).
  Logfile will be: /home/azkali/.cache/x11docker/x11docker.log
  Please report issues at https://github.com/mviereck/x11docker

DEBUGNOTE[19:37:41,453]: time to say goodbye (error)
DEBUGNOTE[19:37:41,483]: storeinfo(): error=64
DEBUGNOTE[19:37:41,515]: time to say goodbye (finish-subshell)
DEBUGNOTE[19:37:41,987]: waitforlogentry(): start_docker(): Stopped waiting for dockerrc=ready in store.info due to terminating signal.
DEBUGNOTE[19:37:42,013]: traperror: Command at Line 6127 returned with error code 1:
  return 1
  8722 - ::start_docker::main::main
DEBUGNOTE[19:37:42,047]: storeinfo(): error=64
DEBUGNOTE[19:37:42,113]: time to say goodbye (traperror)
DEBUGNOTE[19:37:42,207]: time to say goodbye (watchpidlist)
DEBUGNOTE[19:37:42,333]: traperror: Command at Line 8722 returned with error code 1:
  return 1
  8789 - ::main::main
DEBUGNOTE[19:37:42,352]: storeinfo(): error=64
DEBUGNOTE[19:37:42,369]: waitforlogentry(): tailstderr: Waiting since 15s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:42,409]: waitforlogentry(): tailstdout: Waiting since 15s for log entry "x11docker=ready" in store.info
DEBUGNOTE[19:37:42,417]: time to say goodbye (traperror)
DEBUGNOTE[19:37:42,423]: waitforlogentry(): tailstderr: Stopped waiting for x11docker=ready in store.info due to terminating signal.
^Cx11docker/lxde: If the panel does not show an approbate menu
  and you encounter high CPU usage (seen with kata-runtime),
  please run with option --init=systemd.

mkdir: cannot create directory '/home/azkali': Permission denied
cp: cannot create regular file '/home/azkali/.config/openbox/lxde-rc.xml': No such file or directory
mkdir: cannot create directory '/home/azkali': Permission denied
mkdir: cannot create directory '/home/azkali': Permission denied
** Message: 19:37:42.317: main.vala:101: Session is LXDE
** Message: 19:37:42.317: main.vala:102: DE is LXDE

(lxsession:66): Gtk-WARNING **: 19:37:42.327: cannot open display: :126
DEBUGNOTE[19:37:42,451]: time to say goodbye (xpra)
DEBUGNOTE[19:37:42,467]: watchpidlist(): Setting pid 92377 on watchlist: pid1pid
DEBUGNOTE[19:37:42,497]: Received SIGINT
DEBUGNOTE[19:37:42,542]: waitforlogentry(): tailstdout: Stopped waiting for x11docker=ready in store.info due to terminating signal.
DEBUGNOTE[19:37:42,559]: storeinfo(): error=130
DEBUGNOTE[19:37:42,636]: Terminating x11docker.
DEBUGNOTE[19:37:42,642]: storepid(): Stored pid '92377' of 'pid1pid': 
DEBUGNOTE[19:37:42,662]: time to say goodbye (finish)
DEBUGNOTE[19:37:42,793]: finish(): Checking pid 92377 (pid1pid): (already gone)
DEBUGNOTE[19:37:42,874]: finish(): Checking pid 92253 (xpraclient): (already gone)
DEBUGNOTE[19:37:43,000]: finish(): Checking pid 92187 (xpraserver): (already gone)
DEBUGNOTE[19:37:43,099]: finish(): Checking pid 91890 (xpraloop): (already gone)
DEBUGNOTE[19:37:43,202]: finish(): Checking pid 91462 (containershell):   91462 pts/1    00:00:00 bash
DEBUGNOTE[19:37:43,296]: termpid(): Terminating 91462 (containershell):   91462 pts/1    00:00:00 bash

Complete log file: x11docker.log
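
For readers skimming the log above: the decisive line is the `OCI runtime exec failed` message near the end. The following is a minimal sketch for pulling such lines out of an x11docker log. The function name `oci_errors` is made up for illustration and is not part of x11docker; the log path in the example comment is the one printed in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of x11docker): print each distinct
# "OCI runtime ..." line found in a log file, in order of first appearance.
oci_errors() {
    logfile="$1"
    # awk keeps only the first occurrence of each matching line.
    grep 'OCI runtime' "$logfile" | awk '!seen[$0]++'
}

# Example call, assuming the log location shown in the output above:
# oci_errors "$HOME/.cache/x11docker/x11docker.log"
```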

@johncm

johncm commented Oct 20, 2020 via email

@Azkali
Author

Azkali commented Oct 20, 2020

No, it is right; you're confusing the docker run test with the full x11docker log, which includes the x11docker command (first line).
I just updated it to be clearer.
Edit: the complete log file was wrong, though; it was using the --no-setup option. I just updated it using the same x11docker command as above.

@mviereck mviereck added the bug label Oct 20, 2020
@mviereck
Owner

Thank you for the report.
I am not sure what is happening. The core error message is:

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown 

It is quite unlikely that there is no sh in the image. Please check whether you get an interactive sh shell in the container with this command:

x11docker -ti x11docker/lxde sh

I see that you use Ubuntu 20.10 on the host. Ubuntu uses snap instead of apt in some versions. That has already caused several issues in the past. Is your docker installed with snap or apt?
Please also try --cap-default and/or --no-setup to see whether you get different error messages.

The device files and -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra should cause no issue here. However, please try without those.
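
One way to answer the snap-or-apt question is to look at where the docker binary lives. This is only a sketch: the function name `docker_origin` is hypothetical, and it rests on the assumption that snap-installed commands resolve to a path under /snap/.

```shell
#!/bin/sh
# Hypothetical helper: guess how docker was installed from the path of its binary.
# Assumption: snap-installed commands resolve to a path under /snap/.
docker_origin() {
    case "$1" in
        "")      echo "not found" ;;
        /snap/*) echo "snap" ;;
        *)       echo "not snap (possibly apt or a manual install)" ;;
    esac
}

# Example call:
# docker_origin "$(command -v docker)"
```

On Debian-based systems, `dpkg -S "$(command -v docker)"` additionally names the package that owns the binary, if any.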

@Azkali
Author

Azkali commented Oct 20, 2020

You're welcome, thank you for your quick answer.

I'm using docker.io package from apt.

x11docker -ti x11docker/lxde sh (it hangs doing nothing, but another error pops up; see log file): x11docker.log

Using --no-setup or --cap-default causes the same issue. (OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown)

Removing both volumes causes the same issue in every case, too.

@mviereck
Owner

Really odd.
The new error is

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "open /dev/ptmx: no such device": unknown

Does it work running docker directly? Please try:

docker run --rm -ti x11docker/lxde sh

and

docker run --rm -ti --user=1000:1000 x11docker/lxde sh

and

docker run --rm -ti --user=1000:1000 --cap-drop=ALL x11docker/lxde sh

@Azkali
Author

Azkali commented Oct 20, 2020

All three commands worked perfectly! It is indeed strange... I am clueless at the moment

@mviereck
Owner

I am clueless at the moment

Me too ...
Maybe an issue with docker-init? Try:

x11docker -ti --user=root --cap-default --init=none x11docker/lxde sh

and:

docker run --rm -ti --init  x11docker/lxde sh

@Azkali
Author

Azkali commented Oct 20, 2020

With x11docker :

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "open /dev/ptmx: no such device": unknown

Docker alone works here again

@mviereck
Owner

Currently I am out of ideas. Still waiting for inspiration.

Unrelated to the issue itself: I've added the device files /dev/nvhost* and /dev/nvmap to x11docker option --gpu. Only the driver itself is not shared because that is rather unreliable. It is better to provide the driver in other ways.

@mviereck
Owner

Maybe try another image? This one is reported to work on arm64:

x11docker aptman/dbhi:bionic-octave octave

@Azkali
Author

Azkali commented Oct 20, 2020

Thanks a lot for integrating the devices! (Sure, it was just quicker for testing to pass the drivers as is.)
I am testing the above image as we speak. The device sharing went flawlessly using the --gpu option.
If I find anything new I'll report it here. Thanks for your time!

@mviereck
Owner

I've combined the generated docker command with your working example:

  docker run --tty --rm --detach \
  --name x11docker_X127_x11docker-lxde_17663627356 \
  --user 1000:1000 \
  --env USER=azkali \
  --userns host \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt label=type:container_runtime_t \
  --volume '/usr/bin/docker-init':'/usr/local/bin/init':ro \
  --tmpfs /run --tmpfs /run/lock \
  --volume '/tmp/.X11-unix/X127':'/X127':rw \
  --workdir '/tmp' \
  --entrypoint env \
  --env 'container=docker' \
  --env 'NO_AT_BRIDGE=1' \
  --env 'GTK_CSD=0' \
  --env 'GTK_OVERLAY_SCROLLING=0' \
  --env 'MWWM=allwm' \
  --env 'MWNO_RIT=true' \
  --env 'MWNOCAPTURE=true' \
  --env 'QT_X11_NO_NATIVE_MENUBAR=1' \
  --env 'UBUNTU_MENUPROXY=' \
  --env DISPLAY=$DISPLAY \
  --network=host \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -- x11docker/lxde lxterminal

If this fails, the bug is narrowed down to this command.

@Azkali
Author

Azkali commented Oct 20, 2020

This still works...

Also the above octave image uses weston by default with x11-backend.so, which isn't shipped in our BSP. (Thanks, Nvidia...)
So I tried -W, but I couldn't get Wayland to work on my Ubuntu.
I tried with --xorg and got the same issue as we ran into earlier.

@mviereck
Owner

mviereck commented Oct 21, 2020

I found a bug in experimental option --no-setup.
In the regular case x11docker runs a docker exec --user=root [...] sh in the container, mostly for container user setup.
--no-setup should suppress this behaviour, but due to a bug it did not do so correctly.
Maybe the sh error occurs in the docker exec command.

Please update and try:

x11docker -ti --no-setup --cap-default x11docker/lxde sh

Also the above octave image uses weston by default

I'll have a look to change this. You mean, x11docker uses weston for octave? Did you use option --gpu?

@Azkali
Author

Azkali commented Oct 21, 2020

This worked 🙂
Thank you, I confirm that it launches X too !
If I see any new issues I'll let you know, closing the issue for now.

Edit: For the octave image, I used --gpu but the issue might surely be on my end ( concerning Wayland backend not working ) otherwise Tegra just missed x11-backend, it's not supplied by Nvidia in their BSP

@Azkali Azkali closed this as completed Oct 21, 2020
@mviereck
Owner

This worked

Great!

If I see any new issues I'll let you know, closing the issue for now.

Using --no-setup is a workaround, the underlying bug is not fixed yet. So I'll reopen the ticket.

Could you run further tests?
The bug is likely in these lines / docker exec commands:

  [ "$Switchcontaineruser" = "no" ] && [ "$Containersetup" = "yes" ] && {
    echo "debugnote 'dockerrc(): Starting containerrootrc with privileged docker exec'"
    echo "# copy containerrootrc inside of container to avoid possible noexec of host home."
    echo "$Dockerexe exec --privileged --tty $Containername sh -c 'cp $(convertpath share $Containerrootrc) /tmp/containerrootrc ; chmod 644 /tmp/containerrootrc' 2>&1 | rmcr >>$Containerlogfile"
    echo "# run container root setup. containerrc will wait until setup script is ready."
    echo "$Dockerexe exec --privileged --tty -u root $Containername /bin/sh /tmp/containerrootrc 2>&1 | rmcr >>$Containerlogfile"
    echo ""
  }

Could you run a docker exec command in a running x11docker container?
For example, run:

x11docker --showid x11docker/lxde lxterminal

Option --showid prints the container id.
You could try

docker exec --privileged --tty -u root CONTAINERID sh -c "pstree"

For the octave image, I used --gpu but the issue might surely be on my end ( concerning Wayland backend not working ) otherwise Tegra just missed x11-backend, it's not supplied by Nvidia in their BSP

x11docker should not automatically use Wayland setups with --gpu if an NVIDIA card is present. In fact, NVIDIA GPU acceleration only works with --hostdisplay and --xorg.
I'll have a look at the x11docker default checks.

@mviereck mviereck reopened this Oct 22, 2020
@Azkali
Author

Azkali commented Oct 23, 2020

Oh my bad, definitely.

Running --showid alone, gives the same error as before where the container couldn't find the executable path for sh.

With --no-setup and --showid same error.

With --no-setup, --showid, --gpu and --hostdisplay it works and spawns lxterminal, a log message appears, saying that the program is waiting for the container to stop in order to show the id of the container, so I've retrieved it using docker.

Attaching ( docker exec ... CONTAINERID ) to this x11docker-spawned container gives the same issue as previously with /dev/ptmx

@mviereck
Owner

mviereck commented Oct 24, 2020

That sounds like a mess.
Can you please show me the full commands? I am a bit confused now. E.g., what does "--showid alone" mean? x11docker --showid? Or a full command with image name?

Attaching ( docker exec ... CONTAINERID ) to this x11docker-spawned container gives the same issue as previously with /dev/ptmx

Can you try variations of the docker exec command to see whether anything makes a difference? Try changing options and using different commands.
For example:

docker exec  CONTAINERID pstree
docker exec  CONTAINERID sh -c "pstree"
docker exec --privileged -u root CONTAINERID sh -c "pstree"
docker exec  -u root CONTAINERID sh -c "pstree"

At least for the /dev/ptmx issue dropping --tty might make a difference.
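
The variations above can be run in one go. The following is a rough sketch, not something x11docker provides: the function name `try_exec_variants` and the `DOCKER` override variable are made up for illustration, and `true` is swapped in for `pstree` so the result does not depend on pstree being installed in the image.

```shell
#!/bin/sh
# Sketch: run several 'docker exec' variants against one container and
# report which succeed. DOCKER can be overridden, e.g. for a dry run.
DOCKER="${DOCKER:-docker}"

try_exec_variants() {
    cid="$1"
    # One variant per line: the options placed before the container ID.
    # The empty first line is the plain 'docker exec' case.
    printf '%s\n' "" "--tty" "-u root" "--privileged -u root" "--privileged --tty -u root" |
    while IFS= read -r opts; do
        # $opts is intentionally unquoted so it splits into separate options.
        # stdin comes from /dev/null so docker cannot consume the variant list.
        if "$DOCKER" exec $opts "$cid" sh -c "true" >/dev/null 2>&1 </dev/null; then
            echo "ok:   docker exec $opts $cid"
        else
            echo "FAIL: docker exec $opts $cid"
        fi
    done
}

# Example call (CONTAINERID as printed by 'docker ps'):
# try_exec_variants CONTAINERID
```

If only the lines containing --tty fail, that would point at the /dev/ptmx problem rather than at docker exec in general.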
Edit: I've removed the --tty option in docker exec in x11docker. Maybe that already makes a difference, maybe even fixes the bug. Please update and run a check without --no-setup.

Edit2: I've built an x11docker/lxde image based on arm64v8/debian and started it with QEMU on amd64. I could not reproduce the issues.

@Azkali
Author

Azkali commented Oct 24, 2020

First test, fails to spawn the container, lxterminal doesn't start:

$ x11docker --showid x11docker/lxde lxterminal

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown

Second test, same as above, lxterminal doesn't start:

$ x11docker --no-setup --showid x11docker/lxde lxterminal

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown

Third test, which launches lxterminal and spawns the container, but doesn't show the ID in any case:

$ x11docker --no-setup --showid --gpu --hostdisplay x11docker/lxde lxterminal

x11docker note: Option --no-setup: experimental only.

x11docker WARNING: User azkali is member of group docker.
  That allows unprivileged processes on host to gain root privileges.

x11docker WARNING: Option --gpu degrades container isolation.
  Container gains access to GPU hardware.
  This allows reading host window content (palinopsia leak)
  and GPU rootkits (compare proof of concept: jellyfish).

x11docker note: Option --gpu: To allow GPU acceleration with --hostdisplay,
  x11docker will allow trusted cookies.

x11docker note: Option --hostdisplay: To allow --hostdisplay with trusted cookies,
  x11docker must share host IPC namespace with container (option --hostipc)
  to allow shared memory for X extension MIT-SHM.

x11docker WARNING: Option --hostdisplay with trusted cookies provides
      QUITE BAD CONTAINER ISOLATION !
  Keylogging and controlling host applications is possible! 
  Clipboard sharing is enabled (option --cliboard).
  It is recommended to use another X server option like --xpra or --nxagent.

x11docker WARNING: Option --hostipc severely degrades 
  container isolation. IPC namespace remapping is disabled.

x11docker note: Option --init: Unknown init system 
  Possible: tini systemd sysvinit openrc runit s6-overlay none
  Fallback: Using --init=tini instead.

x11docker WARNING: Sharing device file: /dev/dri

x11docker WARNING: Sharing device file: /dev/nvhost-as-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-ctrl

x11docker WARNING: Sharing device file: /dev/nvhost-ctrl-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-ctrl-isp

x11docker WARNING: Sharing device file: /dev/nvhost-ctrl-isp.1

x11docker WARNING: Sharing device file: /dev/nvhost-ctrl-nvdec

x11docker WARNING: Sharing device file: /dev/nvhost-ctxsw-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-dbg-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-isp

x11docker WARNING: Sharing device file: /dev/nvhost-isp.1

x11docker WARNING: Sharing device file: /dev/nvhost-msenc

x11docker WARNING: Sharing device file: /dev/nvhost-nvdec

x11docker WARNING: Sharing device file: /dev/nvhost-nvjpg

x11docker WARNING: Sharing device file: /dev/nvhost-prof-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-sched-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-tsec

x11docker WARNING: Sharing device file: /dev/nvhost-tsecb

x11docker WARNING: Sharing device file: /dev/nvhost-tsg-gpu

x11docker WARNING: Sharing device file: /dev/nvhost-vic

x11docker WARNING: Sharing device file: /dev/nvmap


x11docker ERROR: Got error message from docker daemon: 
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown 
 
  Last lines of logfile: 

  Type 'x11docker --help' for usage information
  Debug options: '--verbose' (full log) or '--debug' (log excerpt).
  Logfile will be: /home/azkali/.cache/x11docker/x11docker.log
  Please report issues at https://github.com/mviereck/x11docker


(lxterminal:1): dbind-WARNING **: 10:12:52.343: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-AMcGzkfGpZ: Connection refused

** (lxterminal:1): WARNING **: 10:12:52.494: Bind on socket failed: No such file or directory


** (lxterminal:1): WARNING **: 10:12:52.497: Configuration file create failed: No such file or directory

DEBUGNOTE[12:12:53,184]: finish(): Container still running. Executing 'docker stop'.
  Will wait up to 15 seconds for docker to finish.
DEBUGNOTE[12:12:53,215]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:54,254]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:55,286]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:56,316]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:57,350]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:58,380]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:12:59,417]: finish(): Waiting for container to terminate ...
DEBUGNOTE[12:13:00,492]: finish(): Container terminated successfully
DEBUGNOTE[12:13:00,708]: x11docker exit code: 64

From there I did docker ps, copied the ID and :

$ docker exec --privileged --tty -u root 1f91e19d4cb1 sh -c "pstree"
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "open /dev/ptmx: no such device": unknown

I'll do the above test ASAP

@Azkali
Author

Azkali commented Oct 27, 2020

Hum just saw your edits, I'm starting to think that somehow installing nvidia runtime messed up my whole docker package/configs. I'll re-flash and test everything again without installing nvidia runtime package.

@mviereck
Owner

I'll re-flash and test everything again without installing nvidia runtime package.

That would be interesting. I have no good idea what is wrong yet.
The --tty changes might help, but that is rather a blind guess, and if they help at all they likely only cover up underlying issues.

I'm starting to think that somehow installing nvidia runtime messed up my whole docker package/configs.

Ubuntu on arm might be another source of issues. Debian might be a better choice.

@Azkali
Author

Azkali commented Oct 29, 2020

Thank you, I'll prepare a plain Debian for the next tests then. (I am using this board as my daily driver, so I need to find a good time to re-flash.)

@mviereck
Owner

Closing due to inactivity.
If you do further tests on this, we can reopen.

@Azkali
Author

Azkali commented Nov 22, 2020

Hi, I did further testing yesterday using Arch Linux. It turns out that we were missing some kernel options related to namespaces.

I reproduced exactly the same issues with the same commands on Arch with the right options now enabled in our kernel. I think it would be good to share our kernel repository with you:
https://gitlab.com/switchroot/kernel/l4t-kernel-4.9/-/blob/linux-rel32-rebase/arch/arm64/configs/tegra_linux_defconfig

Sorry for not keeping you updated with this for a while.

@mviereck
Owner

Thank you for the feedback! I'll reopen now.

@mviereck mviereck reopened this Nov 24, 2020
@mviereck
Owner

Because the issue seems to boil down to docker exec issues, I think it might be a problem with nsenter, which allows processes to enter namespaces of other processes.
Maybe this has to be somehow allowed in your kernel configuration.

A short test without x11docker:

$ docker run --name mysleep alpine sleep 10 &
[1] 222869
$ docker exec mysleep ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 sleep 10
    7 root      0:00 ps aux

I assume that docker exec uses nsenter, or rather the kernel feature behind nsenter.
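
As background to that assumption: nsenter joins another process's namespaces through the entries under /proc/<pid>/ns. A small sketch for listing them follows; the function name `list_namespaces` is hypothetical, and whether docker exec takes exactly this route on a given setup is not established in this thread.

```shell
#!/bin/sh
# Sketch: list the namespace handles of a process. Tools like nsenter join
# another process's namespaces through these /proc/<pid>/ns/* entries.
list_namespaces() {
    pid="${1:-self}"
    for ns in /proc/"$pid"/ns/*; do
        # Skip the unexpanded glob if the directory is missing or empty.
        [ -L "$ns" ] || continue
        printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
    done
}

# Example call for the container's first process, using the PID that the
# x11docker debug log stores as pid1pid:
# list_namespaces 92377
```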

@Azkali
Author

Azkali commented Nov 27, 2020

Test result :

$ docker run --name mysleep alpine sleep 10 &
[1] 6456
$ docker exec mysleep ps aux
OCI runtime exec failed: exec failed: container_linux.go:370: starting container process caused exec: "ps": executable file not found in $PATH: unknown

@mviereck
Owner

Ok, this indicates that there is an issue with the kernel.
I assume that your kernel configuration is missing something needed to allow entering namespaces of other processes, as nsenter or docker exec do.
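
A starting point for checking the kernel configuration could look like the sketch below. The function name `missing_ns_options` is hypothetical, and the option list is an assumption chosen for illustration: this thread does not establish which option, if any, was actually missing.

```shell
#!/bin/sh
# Sketch: report which namespace-related options are not set to "y" in a
# kernel config file. The option list is an assumption for illustration; the
# thread does not establish which option, if any, was actually missing.
missing_ns_options() {
    config="$1"
    for opt in CONFIG_NAMESPACES CONFIG_UTS_NS CONFIG_IPC_NS \
               CONFIG_PID_NS CONFIG_NET_NS CONFIG_USER_NS; do
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Example call against a local copy of a defconfig file:
# missing_ns_options tegra_linux_defconfig
```

Note that a defconfig lists only non-default settings, so an option missing from it is not necessarily disabled in the built kernel. Where the running kernel exposes /proc/config.gz, decompressing that file and checking it gives a more reliable answer.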

@Azkali
Author

Azkali commented Nov 27, 2020

Thank you, I'll take a look at what can be possibly missing and get back to you as soon as I have more insight on the kernel configuration.

@mviereck
Owner

I think I can close here because it is not an x11docker bug but a kernel configuration issue and can be reproduced with docker exec alone.
A workaround is option --no-setup that avoids docker exec in x11docker.
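
The workaround can be tied to a quick probe. This is a sketch under stated assumptions: the function names and the `DOCKER` override variable are made up for illustration, the alpine image is the one from the short test earlier in this thread, and x11docker itself does nothing like this.

```shell
#!/bin/sh
# Sketch: probe whether 'docker exec' works on this host and suggest
# --no-setup if it does not. DOCKER can be overridden for a dry run.
DOCKER="${DOCKER:-docker}"

docker_exec_works() {
    name="execprobe_$$"
    # Start a short-lived container to exec into; return 2 if that fails.
    "$DOCKER" run --detach --rm --name "$name" alpine sleep 30 >/dev/null 2>&1 || return 2
    if "$DOCKER" exec "$name" true >/dev/null 2>&1; then rc=0; else rc=1; fi
    # Clean up the probe container regardless of the outcome.
    "$DOCKER" rm --force "$name" >/dev/null 2>&1
    return "$rc"
}

suggest_x11docker_option() {
    if docker_exec_works; then status=0; else status=$?; fi
    case "$status" in
        0) echo "" ;;
        2) echo "probe container failed to start" >&2; return 2 ;;
        *) echo "--no-setup" ;;
    esac
}

# Example call:
# x11docker $(suggest_x11docker_option) x11docker/lxde lxterminal
```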
