Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement containerd API for checkpoints #22049

Merged
merged 5 commits into from Sep 9, 2016

Conversation

@boucher
Copy link
Contributor

boucher commented Apr 14, 2016

An implementation of the containerd checkpoint API for docker. For a detailed description, see the included documentation at experimental/checkpoint-restore.md.

(ping #20300)

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Apr 14, 2016

One of the problems I see with this API is that in containerd, the --checkpoint argument is on start, but here it has to be on create, and at that time the directory won't even exist. The workflow would have to be: 1) create a container, 2) copy a checkpoint into its directory, 3) start the container. It's also not clear how restarts should work.

@SaiedKazemi

This comment has been minimized.

Copy link

SaiedKazemi commented Apr 22, 2016

@boucher @crosbymichael Any updates on this?

Exit: exit,
Tcp: true,
UnixSockets: true,
Shell: true,

This comment has been minimized.

Copy link
@xemul

xemul Apr 25, 2016

This Shell: true makes me a little bit nervous. Does container share hist sid or terminal with someone else?

@xemul

This comment has been minimized.

Copy link

xemul commented Apr 25, 2016

@boucher have you had chance to look at CI failures? Is there anything on the CRIU side that should be done?

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Apr 25, 2016

This depends on an updated vendor library not included in the patch, so it
won't even compile until that is merged in. I have another branch with the
updated vendor if you want to try it.
On Mon, Apr 25, 2016 at 4:54 AM Pavel Emelyanov notifications@github.com
wrote:

@boucher https://github.com/boucher have you had chance to look at CI
failures? Is there anything on the CRIU side that should be done?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#22049 (comment)

@vdemeester

This comment has been minimized.

Copy link
Member

vdemeester commented Apr 25, 2016

@boucher you can fake the vendoring for the time being though, using the following inside ./hack/vendor.sh :

clone git github.com/docker/engine-api 09758b212deda0c4c0bce5571db73f9b7322a7cc https://github.com/boucher/engine-api

This way it will compile and we will be able to see how the CI is doing with the changes 😉

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Apr 25, 2016

@vdemeester good tip, thanks.

@boucher boucher force-pushed the boucher:docker-checkpoint-restore branch 4 times, most recently from 8398ea6 to 3da667a Apr 25, 2016

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Apr 25, 2016

@vdemeester from these failures, it seems like the vendor change isn't being reflected?

@vdemeester

This comment has been minimized.

Copy link
Member

vdemeester commented Apr 25, 2016

@boucher ah yes, you have to run ./hack/vendor.sh locally and then commit the changes (hopefully it will only update engine-api changes)

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Apr 25, 2016

Fair enough. Is it preferable to have those changes in the PR and back them out later?

@vdemeester

This comment has been minimized.

Copy link
Member

vdemeester commented Apr 25, 2016

@boucher most of the time we do a commit for the vendoring (temporary vendoring or not) and another one with the change. It's easier to follow/review and also to interactively rebase when changing the vendoring commit hash for example 👼

@boucher boucher force-pushed the boucher:docker-checkpoint-restore branch 4 times, most recently from a23facd to 5126a8e Apr 25, 2016

@boucher boucher force-pushed the boucher:docker-checkpoint-restore branch 7 times, most recently from 6e82bad to 6ad2ad8 May 3, 2016

boucher added some commits Aug 30, 2016

Fix typo
Signed-off-by: boucher <rboucher@gmail.com>
Fix the clashing route syntax for checkpoint/container delete.
Signed-off-by: boucher <rboucher@gmail.com>
Update containerd to fix unkillable restored containers.
Signed-off-by: boucher <rboucher@gmail.com>

@boucher boucher force-pushed the boucher:docker-checkpoint-restore branch from 3e8d5dc to 6bc9a2d Sep 9, 2016

@hkvemuri

This comment has been minimized.

Copy link

hkvemuri commented Sep 9, 2016

Looking forward to the changes in the pull request to start exploring the use-cases that it can open.

One thought/suggestion I have is we could perhaps introduce callbacks into the container for each of these actions performed from the docker CLI / API

  • docker start
  • docker stop
  • docker checkpoint
  • docker restore

These could as simple as the entry point executable specified in the compose file. This can aid in the application within the container do some pre/post processing before the docker management layers perform operations on the container. For e.g. prior to a checkpoint the application can perhaps do a sync of its data structures to persistent storage.

Appreciate your comments/views

"github.com/docker/docker/cli/command"
)

// NewCheckpointCommand returns a cobra command for `checkpoint` subcommands

This comment has been minimized.

Copy link
@mlaventure

mlaventure Sep 9, 2016

Contributor

It doesn't return the command but updates the one passed in (i.e. rootCmd)

Update checkpoint comments to be more accurate
Signed-off-by: boucher <rboucher@gmail.com>
@mlaventure

This comment has been minimized.

Copy link
Contributor

mlaventure commented Sep 9, 2016

LGTM (if 💚)

@estesp

This comment has been minimized.

Copy link
Contributor

estesp commented Sep 9, 2016

@mlaventure I'll be happy to work with @boucher to make sure the metadata location change gets in prior to 1.13 freeze so there is stability there for users of the experimental build. Will open an issue so we can track it against 1.13.

@mlaventure

This comment has been minimized.

Copy link
Contributor

mlaventure commented Sep 9, 2016

@estesp thanks, much appreciated!

@hkvemuri your idea should probably become a feature request issue once this PR is merged too.

@estesp

This comment has been minimized.

Copy link
Contributor

estesp commented Sep 9, 2016

LGTM

@estesp estesp merged commit cf58eb4 into moby:master Sep 9, 2016

8 checks passed

docker/dco-signed All commits signed
Details
documentation success
Details
experimental Jenkins build Docker-PRs-experimental 23532 has succeeded
Details
janky Jenkins build Docker-PRs 32113 has succeeded
Details
userns Jenkins build Docker-PRs-userns 14135 has succeeded
Details
vendor Jenkins build Docker-PRs-vendor 1926 has succeeded
Details
win2lin Jenkins build Docker-PRs-Win2Lin 30837 has succeeded
Details
windowsRS1 Jenkins build Docker-PRs-WoW-RS1 2917 has succeeded
Details
@estesp

This comment has been minimized.

Copy link
Contributor

estesp commented Sep 9, 2016

congrats @boucher on your patience and commitment to get this through

@boucher

This comment has been minimized.

Copy link
Contributor Author

boucher commented Sep 9, 2016

Thanks for all the help everyone!

return os.RemoveAll(filepath.Join(checkpointDir, checkpoint))
}

// CheckpointList deletes the specified checkpoint

This comment has been minimized.

Copy link
@crisish

crisish Nov 28, 2016

spelling error

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.