Support C/R of frozen cgroup #20

Closed
klesgidisold opened this issue Aug 14, 2015 · 10 comments

@klesgidisold

If I pause a container

docker pause <container_name>

and then try to checkpoint it

docker checkpoint <container_name>

the checkpoint stalls.
From another terminal I then try to unpause it

docker unpause <container_name>

and it also stalls.

@xemul
Member

xemul commented Aug 14, 2015

This seems to be a Docker issue. @SaiedKazemi, @boucher, would you please help here?

@SaiedKazemi
Contributor

Which Docker version are you using? Also, what is the container you're running? Can you send us the exact commands on how you started your server and the containers?

@klesgidisold
Author

Docker

Client:
 Version:      1.8.0-dev
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   c74defa-dirty
 Built:        Wed Aug  5 23:15:11 UTC 2015
 OS/Arch:      linux/amd64
 Experimental: true

Server:
 Version:      1.8.0-dev
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   c74defa-dirty
 Built:        Wed Aug  5 23:15:11 UTC 2015
 OS/Arch:      linux/amd64
 Experimental: true

I tested with a MySQL and a PostgreSQL container (both official images). I ran:

docker run -d --name <container_name> mysql:5.7

and then the commands that I listed in the first post.

I didn't do anything special to start the Docker server, just:
sudo service docker start
(I'm not sure whether you are asking about something else.)

Thanks for the response.

@boucher

boucher commented Aug 14, 2015

Have you gotten checkpoint/restore to work at all on your system?

Out of curiosity, can I ask what the use case is for pausing the container before checkpointing it?

Also, could you try updating your build of Docker? There have been changes in my checkpoint/restore branch since the git hash shown in this version.

@klesgidisold
Author

Yes C/R is working on my system.

There isn't any use case. I was just experimenting and noticed this behavior. I decided to report it here in case you hadn't noticed it yet.

I am updating my current build of Docker right now and I'll let you know.

Thanks

@boucher

boucher commented Aug 14, 2015

OK, thanks for letting us know. I don't actually know much about how docker pause works, so it will require some looking into.


@SaiedKazemi
Contributor

@xemul @klesgidis
I just tested pausing (freezing) a shell process and then checkpointing it (i.e., no Docker) and CRIU hung. Here is what I did:

[Terminal A]
/bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 3; done'

[Terminal B]
mkdir /sys/fs/cgroup/freezer/0
ps -efl | grep -w sh
echo 7042 > /sys/fs/cgroup/freezer/0/tasks
echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
criu dump -v4 -D /tmp -o dump.log -j -t 7042

[Terminal C]
ps -efl | grep criu
strace -p 6449
Process 6449 attached
wait4(6355,

Unless I am missing something, this seems to be a CRIU issue. Any ideas?
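
For anyone reproducing this, a small guard around the dump can avoid the hang. The following is only a sketch: `freezer_state` and `safe_to_dump` are hypothetical helper names (not part of CRIU or Docker), and the cgroup v1 freezer layout from the commands above is assumed.

```shell
#!/bin/sh
# Hypothetical helper (not part of CRIU): print the state recorded in a
# cgroup v1 freezer directory (THAWED, FREEZING or FROZEN), or UNKNOWN
# if the freezer.state file cannot be read.
freezer_state() {
    cgroup_dir=$1
    if [ -r "$cgroup_dir/freezer.state" ]; then
        cat "$cgroup_dir/freezer.state"
    else
        echo UNKNOWN
    fi
}

# Hypothetical guard: succeed only when the cgroup is THAWED, so that a
# plain "criu dump -t PID" is not started against a frozen task.
safe_to_dump() {
    [ "$(freezer_state "$1")" = "THAWED" ]
}
```

With the paths from the reproduction, it could be used as `safe_to_dump /sys/fs/cgroup/freezer/0 && criu dump -v4 -D /tmp -o dump.log -j -t 7042`, which skips the dump instead of hanging when the cgroup is frozen.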

@avagin
Member

avagin commented Aug 15, 2015

CRIU can't attach to frozen processes with ptrace().
I recently added the --freeze-cgroup option; it should be used for paused containers.

@xemul
Member

xemul commented Aug 15, 2015

@avagin CRIU can attach with ptrace; this is what we do in your patches :) What CRIU cannot do is run the parasite code in frozen tasks, but this can be addressed by extending the freeze-seize-unfreeze-run logic you've introduced.
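
To illustrate how the --freeze-cgroup option mentioned above fits into the earlier reproduction, here is a hedged sketch. `criu_dump_cmd` is a hypothetical wrapper that only prints a command line; the other flags are the ones used earlier in this thread. As the comments above note, dumping a cgroup that is already FROZEN was still an open problem at this point, so this shows the invocation only and is not a confirmed fix.

```shell
#!/bin/sh
# Hypothetical wrapper: print (do not run) a criu dump command line.
# When a freezer cgroup directory is passed as the third argument,
# --freeze-cgroup is appended so that CRIU is told which freezer cgroup
# the tasks live in.
criu_dump_cmd() {
    pid=$1
    images_dir=$2
    freeze_cgroup=${3:-}
    cmd="criu dump -v4 -D $images_dir -o dump.log -j -t $pid"
    if [ -n "$freeze_cgroup" ]; then
        cmd="$cmd --freeze-cgroup $freeze_cgroup"
    fi
    echo "$cmd"
}
```

Printing the command first makes it easy to check the arguments before running it as root, for example `criu_dump_cmd 7042 /tmp /sys/fs/cgroup/freezer/0`.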

@xemul changed the title from "Docker - Pause and Checkpoint causes deadlock" to "Support C/R of frozen cgroup" on Aug 17, 2015
wtf42 added a commit to wtf42/criu that referenced this issue Oct 26, 2015
Issue checkpoint-restore#20. (Support C/R of frozen cgroup)

Restores the freezer cgroup state between the finalize_restore stages.
This must be done after the first stage, because we cannot unmap the restorer
blob from a frozen process, and before the second stage, because we must freeze
the processes before they continue running.

We also need to move fini_cgroup between these stages so that the freezer
cgroup state restorer can access the cgroup mount directories.
The error handlers contain fini_cgroup, so we are sure that the fini_cgroup
call won't be missed.

Signed-off-by: Evgeniy Akimov <geka666 at gmail.com>
wtf42 added a commit to wtf42/criu that referenced this issue Nov 14, 2015
Issue checkpoint-restore#20. (Support C/R of frozen cgroup)

The patch restores the freezer cgroup state between the finalize_restore stages.
This must be done after the first stage, because we cannot unmap the restorer
blob from a frozen process, and before the second stage, because we must freeze
the processes before they continue running.
We also need to move fini_cgroup between these stages so that the freezer
cgroup state restorer can access the cgroup mount directories.
The error handlers contain fini_cgroup, so we are sure that the fini_cgroup
call won't be missed.

Signed-off-by: Evgeniy Akimov <geka666@gmail.com>
Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
wtf42 added a commit to wtf42/criu that referenced this issue Nov 15, 2015
wtf42 added a commit to wtf42/criu that referenced this issue Nov 22, 2015
wtf42 added a commit to wtf42/criu that referenced this issue Nov 23, 2015
Issue checkpoint-restore#20. (Support C/R of frozen cgroup)

The patch restores the freezer cgroup state between the finalize_restore stages.
This must be done after the first stage, because we cannot unmap the restorer
blob from a frozen process, and before the second stage, because we must freeze
the processes before they continue running.
We also need to move fini_cgroup between these stages so that the freezer
cgroup state restorer can access the cgroup mount directories.
The error handlers contain fini_cgroup, so we are sure that the fini_cgroup
call won't be missed.

The patch restores the state of only the one freezer cgroup given by the
--freeze-cgroup option, not the states of the whole hierarchy, because CRIU
supports checkpointing a freezer cgroup hierarchy only when its cgroups are
THAWED, with the exception of the root cgroup given by --freeze-cgroup.

Signed-off-by: Evgeniy Akimov <geka666@gmail.com>
Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
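
The ordering described in this commit message can be sketched in shell. This is only an illustration under stated assumptions: the real logic lives in CRIU's C code, and every function name below (`finalize_stage_one`, `finalize_stage_two`, `restore_freezer_state`, `finalize_restore_sketch`) is a hypothetical stand-in, not a CRIU symbol.

```shell
#!/bin/sh
# Records the order in which the illustrative steps run.
ORDER_LOG=""
log_step() { ORDER_LOG="$ORDER_LOG $1"; }

# Stand-in for the first finalize stage: the restorer blob is unmapped
# while the tasks are still thawed.
finalize_stage_one() { log_step unmap_restorer_blob; }

# Stand-in for the second finalize stage: the tasks are let go.
finalize_stage_two() { log_step resume_tasks; }

# Re-apply the saved state of the single freezer cgroup. Only FROZEN
# needs writing; a THAWED cgroup is left untouched.
restore_freezer_state() {
    cgroup_dir=$1
    saved_state=$2
    if [ "$saved_state" = "FROZEN" ]; then
        echo FROZEN > "$cgroup_dir/freezer.state"
    fi
    log_step restore_freezer
}

# Freezer state is restored after stage one and before stage two.
finalize_restore_sketch() {
    finalize_stage_one
    restore_freezer_state "$1" "$2"
    finalize_stage_two
}
```

Calling `finalize_restore_sketch <freezer_cgroup_dir> FROZEN` writes the state between the two stand-in stages, which mirrors the constraint that a frozen process cannot unmap the restorer blob but must be frozen before it resumes.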
wtf42 added a commit to wtf42/criu that referenced this issue Nov 26, 2015
@xemul
Member

xemul commented Dec 16, 2015

8b04551
