1.13 experimental: checkpoint/restore not working #1059

Open
thaJeztah opened this issue Dec 20, 2016 · 41 comments

Comments

@thaJeztah
Member

Docker 1.13 adds experimental support for checkpoint/restore, using CRIU (https://criu.org); see Docker Checkpoint & Restore.

This feature currently cannot be used on the Docker for Mac "beta" / "master" channel, because CRIU is not available in the VM.

Expected behavior

$ docker run --security-opt=seccomp:unconfined --name cr -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

$ docker checkpoint create cr checkpoint1

Actual behavior

$ docker run --security-opt=seccomp:unconfined --name cr -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'

$ docker checkpoint create cr checkpoint1
Error response from daemon: Cannot checkpoint container cr: rpc error: code = 2 desc = exit status 1: "Unable to execute CRIU command: criu\n"

Information

Version 1.13.0-rc4-beta34 (14831)
Channel: Master
5987079516

Steps to reproduce the behavior

See above

CRIU should be available on Docker for Mac / Docker for Windows, or a note should be added to the "known issues" section
https://docs.docker.com/docker-for-mac/troubleshoot/#/known-issues

I opened an internal issue for this as well

/cc @justincormack @londoncalling

@smakam

smakam commented Jan 24, 2017

Hi
I am hitting the same issue with Docker 1.13 experimental with criu version v2.10-1-gd9486bd.
Any solution for this?
I am using Ubuntu 16 VM to try this out.

Thanks
Sreenivas

@thaJeztah
Member Author

@smakam if you're getting the same error as above, then criu is not installed. However, this issue is purely about Docker for Mac, which currently doesn't have support for criu, so it's not directly related to your issue.

@smakam

smakam commented Jan 25, 2017

@thaJeztah Thanks for the response. Sorry, it was a mistake on my part. I had criu, but I had not run "make install", and that was causing the issue. After running "make install", it's working fine now.
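
The "Unable to execute CRIU command: criu" error reported in this thread is consistent with the criu binary not being found on the PATH. A minimal sketch of a check for that, assuming a POSIX shell; the function name check_criu_on_path is made up for illustration and is not part of Docker or CRIU:

```shell
# Hypothetical helper (not part of Docker or CRIU): report whether a
# `criu` executable can be found on the current shell's PATH.
check_criu_on_path() {
    if criu_path=$(command -v criu); then
        printf 'criu found at %s\n' "$criu_path"
        return 0
    fi
    printf 'criu not found on PATH\n' >&2
    return 1
}
```

This only inspects the PATH of the shell it runs in. The Docker daemon may run with a different PATH, so a successful result here does not guarantee that the daemon can find criu. Once the binary is found, CRIU's own "criu check" subcommand (which generally needs root) can be used to verify kernel support.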

@boucher

boucher commented Jan 26, 2017

I was able to get this working this morning. There's a pre-built version of criu in the alpine testing repository, but it doesn't quite work. I've attached an updated binary that should work if put into the docker for mac vm: criu-alpine.zip

Here are the other libraries you'll need installed:

/ # ldd /usr/sbin/criu
        /lib/ld-musl-x86_64.so.1 (0x56394a323000)
        libprotobuf-c.so.1 => /usr/lib/libprotobuf-c.so.1 (0x7f1c09b22000)
        libnl-3.so.200 => /usr/lib/libnl-3.so.200 (0x7f1c09904000)
        libnet.so.1 => /usr/lib/libnet.so.1 (0x7f1c096ec000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x56394a323000)

The other thing I had to do to get it working was upgrade the version of tar, with:

apk --update add tar

It would be great to get this built into docker for mac and have checkpoint/restore work out of the box!
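
The ldd listing above can be turned into a list of library files to copy alongside the binary. A small sketch, assuming the usual "name => /path (address)" ldd output format; ldd_library_paths is a hypothetical helper name introduced here for illustration:

```shell
# Hypothetical helper: read ldd output on stdin and print the unique
# resolved library paths, one per line.
ldd_library_paths() {
    # Keep only "name => /absolute/path (addr)" lines and print the path.
    # LC_ALL=C makes the sort order independent of the user's locale.
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }' | LC_ALL=C sort -u
}
```

It could be used as: ldd /usr/sbin/criu | ldd_library_paths. Lines without "=>" (such as the loader entry on the first line of the listing) are skipped by this filter, although in the listing above the same loader path also appears as the target of the libc.musl entry.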

@justincormack
Member

@boucher can you submit a patch to Alpine to fix the upstream? I can't use a binary.

@avagin

avagin commented Jan 30, 2017

I pushed all required patches into the alpine branch:
https://github.com/avagin/criu/commits/alpine

And I think it would be better to wait two weeks for the next CRIU release (2.11).

@justincormack
Member

Thanks.

@boucher

boucher commented Feb 13, 2017

@avagin Looks like 2.11 just came out? Is the next step to get that added to the alpine testing repo?

@justincormack
Member

cc @ncopa

@avagin

avagin commented Feb 15, 2017

@boucher I suggest getting 2.11 directly; it should work without any additional changes.
xemul/criu@8719b7c

@boucher

boucher commented Feb 15, 2017

@justincormack Are you able to build packages for including in d4m, or do you need it to be in a repository?

@boucher

boucher commented Feb 23, 2017

Any updates? Would be great to get this working and seems like we're pretty close.

@junior

junior commented Apr 25, 2017

Any updates?

@matti

matti commented Jun 5, 2017

Is this going to be included or not?

@boucher

boucher commented Aug 31, 2017

Quick update: I've built a docker image that should install a working version of CRIU into the docker-for-mac virtual machine.

https://hub.docker.com/r/boucher/criu-for-mac/

@tsmgodoi

@boucher Do I have to run this inside the VM or can I run it from the usual docker client?

@boucher

boucher commented Sep 25, 2017

@tsmgodoi You run it from your mac (the normal docker client): docker run --rm -it --privileged --pid=host boucher/criu-for-mac

@tsmgodoi

tsmgodoi commented Sep 25, 2017

I see. Sorry for the off-topic, but do you have any clue on how to run this on boot2docker on Windows?

@boucher

boucher commented Sep 25, 2017

Have you tried just running the same command?

@boucher

boucher commented Sep 25, 2017

In any case, the actual commands that need to be run on the VM are shown in the dockerfile for this image: https://github.com/boucher/criu-for-mac/blob/master/Dockerfile

If you can access the VM you should be able to run those same commands directly to end up with a working CRIU install.

@tsmgodoi

I've tried running the image but it doesn't work. It says:

sh: criu: not found
sh: apk: not found

I'll try running the commands directly on the VM.

@bitmensch

/remove-lifecycle stale

Unfortunately checkpoint/restore is still not working on Docker for Mac. Is there any ETA when this will be made officially available?

Unfortunately, the workaround suggested by @boucher also no longer works, because the Docker VM now uses a read-only linuxkit filesystem.

Are there any other known workarounds to get CRIU to work with the latest Docker for Mac versions?

@maggesi

maggesi commented Apr 30, 2018

/remove-lifecycle stale
I also hope to see this problem solved in the near future.

@docker-robott
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with a /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@BEllis

BEllis commented Sep 13, 2018

/remove-lifecycle stale

Can we reopen this issue? It's still a problem.

@BEllis

BEllis commented Sep 13, 2018

Thanks. It sounds like the issue may be upstream (alpine?), but it would be great if we could keep the issue open to follow up and chase it.

@guillaumerose
Contributor

/lifecycle frozen

@arashd

arashd commented Jan 14, 2019

Hitting the same issue. I brought up the issue on the linuxkit repo, but this is a more appropriate place to mention it.

I've also been trying to get the checkpoint/restore experimental feature of docker to work on a mac. After turning on the experimental feature, I see:

$ docker checkpoint create 53fc5dcc6fc9 checkpoint1
Error response from daemon: Cannot checkpoint container 53fc5dcc6fc9: runc did not terminate sucessfully: CRIU version check failed: exec: "criu": executable file not found in $PATH path= /var/run/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/53fc5dcc6fc9376f1cf067015750ce67cfbac1863a86b73593cc3dbab974223f/criu-dump.log: unknown

criu doesn't exist on the vm that docker for mac uses by default. I am pretty sure I need to install CRIU on the d4m linux vm. I attempted to use this approach (https://github.com/boucher/criu-for-mac), but realized it doesn’t work since docker for mac, in its newer versions, uses a .iso file built with Linuxkit for its vm, and the image has a read-only filesystem.

$ docker run --rm -it --privileged --pid=host boucher/criu-for-mac
sh: criu: not found
ERROR: Unable to lock database: Read-only file system
ERROR: Failed to open apk database: Read-only file system

The closest I could get was this post and repo, which attempt to pull some of the missing pieces out of the existing docker for mac image.

Is there a better way of building criu into the docker-for-mac vm when building with linuxkit?

Would appreciate any help.
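
The "Read-only file system" errors above suggest that the apk database lives on a mount that cannot be written to. A minimal sketch of one way to confirm that from inside the VM, assuming it exposes a mount table in the standard Linux /proc/mounts format; mount_is_readonly is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: read a mount table in /proc/mounts format on stdin
# and report whether the mount point given as $1 is mounted read-only.
# Exit status: 0 = read-only, 1 = writable, 2 = mount point not listed.
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            # A later entry for the same mount point overrides an earlier one.
            found = 1
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END {
            if (!found) exit 2
            exit (ro ? 0 : 1)
        }
    '
}
```

For example: mount_is_readonly / < /proc/mounts && echo "root filesystem is read-only". A read-only root would explain why installing packages into the VM at runtime fails, although it does not by itself say where a writable location on the daemon's PATH might exist.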

@nilols

nilols commented Jan 28, 2019

@arashd I was curious if you have made any progress on this?

I guess the only real solution to this is to have criu pre installed in the docker vm image

But I managed to add criu to the docker vm by first installing criu in another alpine container (one that allows apk), then moving criu and its shared libs into the docker vm with docker cp and wget. After using mount --bind to map the folders that contain criu and the libs onto existing empty folders already in the path, I could run criu, but when I tried to create a checkpoint through docker it still complained that criu was not on the path.

@rnorth

rnorth commented Mar 22, 2019

I guess the only real solution to this is to have criu pre installed in the docker vm image

Having spent a lot of time trying to get CRIU to work, I'd agree that this is the only sustainable solution. It is rather disheartening that Docker don't seem to be giving this any attention. In my view CRIU, as a widely available feature, could be extremely useful - another game-changer. As it is, without support on dev machines, it's not practical to use at all, and this is sad.

@Mihirmathur

I'm running into this same problem. What's the easiest way to pre-install CRIU in the docker image?

@Jamie5

Jamie5 commented Jun 9, 2020

@thaJeztah is the tagged status correct, that "The issue has been assigned to a engineer and is waiting a fix"?

@toonvanstrijp

@thaJeztah any update on this? :)

@alanruttenberg

@thaJeztah I'd like to convey my strong interest in having this implemented in Docker. It would enable me to distribute a set of java-based tools that are currently impractical due to start up time. I've been following this for years.

@alongouldman

I wish this would work!

@asterbini

Any chance we will get it?

@alanruttenberg

@asterbini it works in podman now

@turadg

turadg commented Aug 22, 2023

@alanruttenberg have you tried it with Podman on Mac? The sudo podman container checkpoint command gives me an error. I think that's because CRIU isn't on the host.

@tzvetkovg

Any progress on this? Does it still not work on macOS?
