Error dumping a file-handle on AUFS (WONTFIX for now) #284

Closed
iprimo opened this issue Feb 15, 2017 · 13 comments

Comments

@iprimo

iprimo commented Feb 15, 2017

Hi,
I am trying to checkpoint/restore (C/R) a container running a GUI app, and I observed the following error (dump file attached).
I would appreciate it if you could help.
Thanks

root@ubuntu01:/home/administrator# docker checkpoint create abcd_abcd chp_01
Error response from daemon: Cannot checkpoint container abcd_abcd: rpc error: code = 2 desc = exit status 1: "criu failed: type NOTIFY errno 0\nlog file: /var/lib/docker/containers/dc2e0c30f22ca879615c50c02b205e7d63135de97dab600fa445c96307155422/checkpoints/chp_01/criu.work/dump.log\n"
root@ubuntu01:/home/administrator#

root@ubuntu01:/home/administrator# docker --version
Docker version 1.13.1, build 092cba3

root@ubuntu01:/home/administrator# criu --version
Version: 2.11

dump.txt

@xemul
Member

xemul commented Feb 16, 2017

Here's the failing piece:

(00.717981) sk unix:    Dumping extern: ino 0x5f19 peer_ino 0x5f1a family    1 type    1 state  1 name (null)
(00.717998) sk unix:    Dumped extern: id 0x42 ino 0x5f19 peer 0 type 2 state 10 name 0 bytes
(00.718007) sk unix:    Ext stream not supported: ino 0x5f19 peer_ino 0x5f1a family    1 type    1 state  1 name (null)
(00.718011) Error (criu/sk-unix.c:710): sk unix: Can't dump half of stream unix connection.

This means that there's a Unix stream socket in your container whose peer is not owned by any process in the same container.

Let's explore this? Can you show the output of 'ps xaf' and 'ss -xanp' inside the container you dump?
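
For anyone who wants to pull lines like these out of their own dump log, here is a minimal sketch. It assumes CRIU marks failures with the literal text `Error (`, as in the excerpt above; `criu_errors` is a hypothetical helper name, not part of CRIU.

```shell
# Print the error lines of a CRIU dump log, prefixed with their line numbers.
# criu_errors is a hypothetical helper, not a CRIU command.
criu_errors() {
    # $1: path to dump.log
    grep -n 'Error (' "$1"
}

# Usage (the log path is the one printed in the docker error message):
# criu_errors /var/lib/docker/containers/<container-id>/checkpoints/chp_01/criu.work/dump.log
```

The lines just before the first reported error usually give the context, as with the `sk unix` lines quoted above.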

@iprimo
Author

iprimo commented Feb 16, 2017

@xemul thanks for your reply.
Please find the output below:
Thanks


root@ubuntu01:/home/administrator# docker exec abcd_abcd ps xaf
  PID TTY      STAT   TIME COMMAND
  176 ?        Rs     0:00 ps xaf
   20 ?        Sl     0:03 /usr/bin/python /usr/bin/xpra start :29 --start=firefox --bind-tcp=0.0.0.0:23
   21 ?        Ss     0:01  \_ Xvfb-for-Xpra-:29 +extension Composite -screen 0 5760x2560x24+32 -dpi 96 -nolisten tcp -noreset -auth /root/.Xauthority :29
   76 ?        S      0:00  \_ /bin/sh -c firefox
   77 ?        Sl     0:09      \_ /usr/lib/firefox/firefox
    1 ?        Ss     0:00 /bin/bash
   30 ?        Ss     0:00 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
  105 ?        Sl     0:00 /usr/lib/at-spi2-core/at-spi-bus-launcher

root@ubuntu01:/home/administrator# docker exec abcd_abcd ss -xanp
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "exec: "ss": executable file not found in $PATH"

@xemul
Member

xemul commented Feb 16, 2017

@iprimo, could you use nsenter to run ss instead of docker exec? There's no ss binary inside the container, so the exec just fails. You need to nsenter only the target container's net namespace.

@iprimo
Author

iprimo commented Feb 16, 2017

@xemul, not sure if this can help.

root@ubuntu01:~# docker ps
CONTAINER ID   IMAGE                               COMMAND       CREATED          STATUS          PORTS                                        NAMES
701f4ca723c2   iprimo/guiossh_08:v0.0003_firefox   "/bin/bash"   45 minutes ago   Up 45 minutes   0.0.0.0:2222->22/tcp, 0.0.0.0:2223->23/tcp   abcd_abcd

root@ubuntu01:~# PID=$(docker inspect --format {{.State.Pid}} 701f4ca723c2)

root@ubuntu01:~# echo $PID
3264

root@ubuntu01:~# nsenter --target $PID --mount --uts --ipc --net --pid
mesg: ttyname failed: No such file or directory

root@701f4ca723c2:/# ss -xanp
-bash: ss: command not found

root@701f4ca723c2:/# ps xaf
  PID TTY      STAT   TIME COMMAND
  283 ?        S      0:00 -bash
  290 ?        R+     0:00  \_ ps xaf
   20 ?        Rl     0:05 /usr/bin/python /usr/bin/xpra start :29 --start=firefox --bind-tcp=0.0.0.0:23
   21 ?        Ss     0:01  \_ Xvfb-for-Xpra-:29 +extension Composite -screen 0 5760x2560x24+32 -dpi 96 -nolisten tcp -noreset -auth /root/.Xauthori
   76 ?        S      0:00  \_ /bin/sh -c firefox
   77 ?        Sl     0:25      \_ /usr/lib/firefox/firefox
    1 ?        Ss     0:00 /bin/bash
   30 ?        Ss     0:00 /usr/bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
  105 ?        Sl     0:00 /usr/lib/at-spi2-core/at-spi-bus-launcher
root@701f4ca723c2:/#

@xemul
Member

xemul commented Feb 16, 2017

@iprimo, try not entering the mount namespace, so that the filesystem view is still the host's and the ss command can be found. Don't enter the pid namespace either, so that the PIDs reported by the kernel can be found in the host's /proc.
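
A minimal sketch of that invocation, assuming it is run as root on the host and that `ss` is installed there. `ss_in_netns` is a hypothetical helper name; `--target` and `--net` are standard nsenter options.

```shell
# Run the host's ss inside a container's network namespace only.
# ss_in_netns is a hypothetical helper; it needs root on the host.
ss_in_netns() {
    # $1: host PID of a process inside the container
    # Only --net is passed, so the mount and pid namespaces stay the host's.
    nsenter --target "$1" --net ss -xanp
}

# Usage, reusing the PID lookup shown earlier in the thread:
# PID=$(docker inspect --format '{{.State.Pid}}' abcd_abcd)
# ss_in_netns "$PID"
```

Because the mount namespace is not entered, the `ss` binary is resolved from the host filesystem, and because the pid namespace is not entered, the PIDs that `-p` reports match the host's /proc.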

@iprimo
Author

iprimo commented Feb 16, 2017

@xemul, please find the capture file attached.

file.txt

@xemul
Member

xemul commented Feb 16, 2017

@iprimo, it looks like you haven't entered the net namespace, as there are no sockets from your container in the output. Can you set up remote access for me, so I can look at this myself? Or, even better, give me the container itself, or maybe a VM image with the problematic container inside.

@xemul
Member

xemul commented Feb 16, 2017

So, I've attached to the node you have and here's what I see.

The dump fails for a different reason: it fails to parse the contents of the /proc/pid/fdinfo/nr entry describing an inotify. This happens because the kernel was unable to encode a file handle (look at dmesg | fgrep -i handle; it will say that encoding a handle failed). In turn, the inability to encode a handle comes from the fact that the root FS of your container is AUFS. It's an obsolete FS which doesn't itself support inotifies and file handles.

Thus, in order to proceed, you need to switch your Docker environment to at least OverlayFS. It has issues with inotifies too, but we'd be able to work with them, unlike AUFS, which is a dead end.
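
The checks described above can be repeated on the host with something like the following sketch. `list_inotify_fds` is a hypothetical helper, and its optional second argument (an alternative proc root) exists only so it can be tried against a scratch directory. It relies on inotify watches appearing in fdinfo as lines starting with `inotify`.

```shell
# List the fdinfo entries of one process that describe inotify watches.
# list_inotify_fds is a hypothetical helper; it inspects a single PID,
# so it has to be repeated for each process in the container.
list_inotify_fds() {
    pid=$1
    proc_root=${2:-/proc}
    grep -l '^inotify' "$proc_root/$pid"/fdinfo/* 2>/dev/null
}

# Usage on the host (3264 is the container PID from earlier in the thread):
# list_inotify_fds 3264
# dmesg | grep -i handle                # kernel messages about handle encoding
# docker info --format '{{.Driver}}'    # prints the storage driver in use
```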

xemul changed the title from "unix socket - type NOTIFY errno" to "Error dumping a file-handle on AUFS (WONTFIX for now)" on Feb 16, 2017
xemul added the wontfix label on Feb 16, 2017

@iprimo
Author

iprimo commented Feb 17, 2017

@xemul thanks for the great work.
I will reinstall Docker and get back to you.

@xemul
Member

xemul commented Feb 17, 2017

@iprimo, you're welcome :) AFAIK there's no need to reinstall Docker; the choice between aufs and overlayfs is controlled by a config file. But it's up to you.
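
As a hedged illustration of that config-file route: on a typical Linux install the daemon reads /etc/docker/daemon.json, and its `storage-driver` key selects the driver. The sketch below assumes that default path and a systemd-managed daemon; `write_storage_driver` is a hypothetical helper, and it overwrites the target file rather than merging with existing settings.

```shell
# Write a Docker daemon config that selects a storage driver.
# write_storage_driver is a hypothetical helper; it OVERWRITES the target file.
write_storage_driver() {
    driver=$1
    target=${2:-/etc/docker/daemon.json}
    printf '{\n  "storage-driver": "%s"\n}\n' "$driver" > "$target"
}

# Usage (as root), followed by a daemon restart:
# write_storage_driver overlay2
# systemctl restart docker
```

Images and containers created under the previous driver are not visible after the switch, so they need to be pulled or rebuilt.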

@iprimo
Author

iprimo commented Feb 18, 2017

@xemul, as advised, I changed the storage driver to OverlayFS. The error has now changed to the following:

Error (criu/irmap.c:86): irmap: Can't stat /no-such-path: No such file or directory
Error (criu/fsnotify.c:285): fsnotify: Can't dump that handle

Dump files attached.

docker spec.txt
dump.log.txt

@xemul
Member

xemul commented Feb 20, 2017

@iprimo OK, that's much better :) Can you let me onto your box again, so I can check one more thing?

@xemul
Member

xemul commented Mar 28, 2017

Duplicate of #136.

xemul closed this as completed on Mar 28, 2017