Cannot checkpoint container: rpc error: code = 2 desc = exit status 1: "criu failed: type NOTIFY errno 0" #587

Closed
wangzhao123456 opened this issue Dec 27, 2018 · 6 comments

Comments

@wangzhao123456

Hi, everyone
I am trying to checkpoint/restore a container that is running the Google Chrome browser, and the checkpoint step fails.
The environment and error log are described below:

uname -a

Linux ubuntu 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

docker version

Client:
Version: 17.06.0-ce
API version: 1.30
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:19:16 2017
OS/Arch: linux/amd64

Server:
Version: 17.06.0-ce
API version: 1.30 (minimum version 1.12)
Go version: go1.8.3
Git commit: 02c1d87
Built: Fri Jun 23 21:17:13 2017
OS/Arch: linux/amd64
Experimental: true

docker info

Containers: 10
Running: 0
Paused: 0
Stopped: 10
Images: 5
Server Version: 17.06.0-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 49
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
apparmor
Kernel Version: 4.4.0-31-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 983.9MiB
Name: ubuntu
ID: GTKY:QTFV:SROI:BAMP:HX3E:IOWU:MWGB:3RMW:PS27:2V7I:JZSI:DOAG
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support

docker --version

Docker version 17.06.0-ce, build 02c1d87

sudo criu check

Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:791): sockets: The current kernel doesn't support ipv4 raw_diag module
Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:834): sockets: The current kernel doesn't support ipv6 raw_diag module
Looks good.

sudo criu check --all

Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:791): sockets: The current kernel doesn't support ipv4 raw_diag module
Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:834): sockets: The current kernel doesn't support ipv6 raw_diag module
Error (criu/cr-check.c:984): The TCP_REPAIR_WINDOW option isn't supported.
Error (criu/cr-check.c:928): TCP_REPAIR can't be enabled for half-closed sockets
Warn (criu/cr-check.c:1060): Do not have API to map vDSO - will use mremap() to restore vDSO
Error (criu/cr-check.c:1049): Non-cooperative UFFD is not supported
Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:791): sockets: The current kernel doesn't support ipv4 raw_diag module
Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:834): sockets: The current kernel doesn't support ipv6 raw_diag module
Error (criu/cr-check.c:821): autofs not supported.
Warn (criu/cr-check.c:1026): compat_cr is not supported. Requires kernel >= v4.12
Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.

I can successfully checkpoint/restore a simple looping container using the following commands:

docker run --name cr -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

docker checkpoint create cr checkpoint1

docker checkpoint ls cr

docker start --checkpoint checkpoint1 cr

I can run Chrome browser in container using the following commands:

docker run --name chrome-criu -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --net=host --privileged ansible/ubuntu14.04-ansible-chrome /usr/bin/google-chrome-stable --no-sandbox https://www.baidu.com

"ansible/ubuntu14.04-ansible-chrome" is the docker image with Google Chrome installed.
"/usr/bin/google-chrome-stable" is the executable program of Google Chrome.
"--no-sandbox https://www.baidu.com" are the command line options of Google Chrome.

I can successfully load and display the webpage "https://www.baidu.com".
Then I tried to checkpoint container chrome-criu using the following command:

docker checkpoint create chrome-criu checkpoint1

and I got the following error (the dump.log file is attached below):

Cannot checkpoint container chrome-criu: rpc error: code = 2 desc = exit status 1: "criu failed: type NOTIFY errno 0"

I searched for similar problems and their solutions. I removed the "-it" options from the "docker run" command, and I also tried running criu as root,
but I still got the same error. I read through dump.log, but I can't figure out what is going wrong or how to fix it.
Any help would be appreciated. Thanks in advance!

@wangzhao123456
Author

dump.log
The above is the dump.log file. Thanks.

@rst0git
Member

rst0git commented Dec 27, 2018

To solve:

Error (criu/libnetlink.c:27): ERROR -2 reported by netlink (No such file or directory)
Warn (criu/sockets.c:791): sockets: The current kernel doesn't support ipv4 raw_diag module

Try to run:

$ sudo modprobe raw_diag

The error (from the dump.log file) which is causing the failure is

Error (criu/files-reg.c:854): Can't dump ghost file /dev/shm/.com.google.Chrome.U6lkKv of 4198400 size, increase limit

With CRIU version 3.11 you should be able to solve it with:

# echo "ghost-limit 100M" >> /etc/criu/default.conf

However, Google Chrome is a GUI application. To checkpoint/restore such an application, you will need to run it inside a VNC environment. You can find more information on how to do that at https://criu.org/VNC
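
For anyone reading a dump.log the same way, the failing step can usually be narrowed down by filtering for the lines CRIU tags with "Error (". A minimal sketch, assuming the log path is passed as the first argument; criu_errors is a hypothetical helper name, not a CRIU command:

```shell
#!/bin/sh
# criu_errors is a hypothetical helper, not part of CRIU.
# Prints every line of the given log that contains an "Error (" tag,
# prefixed with its line number in the log file.
criu_errors() {
    grep -n 'Error (' "$1"
}
```

Running criu_errors dump.log on the attached log should list, among others, the "Can't dump ghost file" line quoted above. Later Error lines are often just consequences of the first one.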

@wangzhao123456
Author

Thanks for your reply.
When running command "sudo modprobe raw_diag", I got the following error:

modprobe: FATAL: Module raw_diag not found.

and I didn't find the raw_diag module in /var/lib/modules/$(uname -r).

There is no criu directory in /etc/, so when I ran "echo "ghost-limit 100M" >> /etc/criu/default.conf", I got

bash: /etc/criu/default.conf: No such file or directory.

Thanks again.
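
A note on the module lookup: kernel modules normally live under /lib/modules/$(uname -r), not /var/lib/modules, so it may be worth repeating the search there. A minimal sketch of such a check; module_file_exists is a hypothetical helper name, and a miss is not conclusive because a feature compiled into the kernel has no .ko file (it may also simply be absent from this 4.4 kernel build):

```shell
#!/bin/sh
# module_file_exists is a hypothetical helper name.
# Usage: module_file_exists NAME [MODULE_DIR]
# Succeeds if a file NAME.ko (optionally compressed, e.g. NAME.ko.xz)
# exists anywhere under MODULE_DIR.
# MODULE_DIR defaults to /lib/modules/<running kernel release>.
module_file_exists() {
    name="$1"
    moddir="${2:-/lib/modules/$(uname -r)}"
    [ -n "$(find "$moddir" -type f -name "${name}.ko*" 2>/dev/null | head -n 1)" ]
}
```

For example, module_file_exists raw_diag && echo found would report whether a loadable raw_diag module is installed for the running kernel.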

@adrianreber
Member

You can easily create the directory with mkdir /etc/criu. But the main problem will be that it is a GUI application.
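
The two steps (create the directory, then append the option) can be combined so that the redirect no longer fails. A minimal sketch, assuming CRIU 3.11 or later reads /etc/criu/default.conf as described earlier in this thread; set_ghost_limit is a hypothetical helper name, and the optional arguments exist only so the function can be tried against a scratch directory first:

```shell
#!/bin/sh
# set_ghost_limit is a hypothetical helper name.
# Usage: set_ghost_limit [CONF_DIR] [LIMIT]
# Defaults: CONF_DIR=/etc/criu, LIMIT=100M (the value suggested in this thread).
set_ghost_limit() {
    conf_dir="${1:-/etc/criu}"
    limit="${2:-100M}"
    conf="$conf_dir/default.conf"
    # Create the directory first; the ">>" redirect cannot create it.
    mkdir -p "$conf_dir" || return 1
    # Append only if no ghost-limit line is present yet, to avoid duplicates.
    if ! grep -q '^ghost-limit ' "$conf" 2>/dev/null; then
        printf 'ghost-limit %s\n' "$limit" >> "$conf"
    fi
}
```

Called as root with no arguments, this writes "ghost-limit 100M" to /etc/criu/default.conf. Calling it again leaves the file unchanged.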

@wangzhao123456
Author

Thanks for your help!

@expresspotato

I am also having this issue, but criu errors out when trying to dump the pulseaudio fd.

[root@localhost ~]# criu check --all
Error (criu/cr-check.c:626): Kernel doesn't support PTRACE_O_SUSPEND_SECCOMP
Error (criu/cr-check.c:670): Dumping seccomp filters not supported: Input/output error
Error (criu/cr-check.c:889): cgroupns not supported. This is not fatal.
Warn  (criu/cr-check.c:1061): Do not have API to map vDSO - will use mremap() to restore vDSO
Warn  (criu/net.c:2770): Unable to get socket network namespace
Warn  (criu/cr-check.c:1027): compat_cr is not supported. Requires kernel >= v4.12
Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.

Then

[root@localhost ~]# criu dump --ghost-limit=1G --tcp-established -D dump --shell-job -t 98601
Warn  (compel/arch/x86/src/lib/infect.c:273): Will restore 98601 with interrupted system call
Error (criu/files-reg.c:1282): Can't lookup mount=3 for fd=156 path=/memfd:pulseaudio (deleted)
Error (criu/cr-dump.c:1353): Dump files (pid: 98601) failed with -1
Error (criu/cr-dump.c:1707): Dumping FAILED.
