-
Notifications
You must be signed in to change notification settings - Fork 18.8k
Fix checkpoint's exiting semantics. #37360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
janky seems to be failing for no good reason?
|
friendly ping |
cc @cpuguy83 maybe? |
Can we add an integration test for this? |
@cpuguy83 I'd like to, but we've actually had trouble getting this to work with runc (we're using runsc instead). Are there existing integration tests somewhere that we can build off of? |
Is this the error?
|
ping @kolyshkin |
Actually, there is one more problem. Docker (containerd) has to call "runc checkpoint" with "--empty-ns network" to dump a docker container. I have a patch which fixes both these issues: avagin/docker-ce@911c89f#diff-e1f8834158a9117cafe1744dc2c7adb2 But it depends on containerd changes: |
I like @avagin approach better (i.e. appending to options if necessary). |
Sure, we can append to options. Having to call with an empty NS is specific to runc. runsc does not need to be called with that. Can this be part of something runc-specific, or can this be done in an API-agnostic way? |
@hugelgupf Here are changes which resolve the issue with an empty NS: |
41938ff
to
c9c0ef9
Compare
Codecov Report
@@ Coverage Diff @@
## master #37360 +/- ##
==========================================
- Coverage 34.95% 34.92% -0.03%
==========================================
Files 610 610
Lines 44886 44873 -13
==========================================
- Hits 15690 15673 -17
- Misses 27077 27081 +4
Partials 2119 2119 |
@kolyshkin @avagin cpuguy wanted an integration test, but it seems like there are none for moby checkpoint/restore already? Is there anything y'all have we can contribute to? |
Ping |
LGTM |
Now that it's lgtm'd, how do I get this merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐸
cc @mlaventure @thaJeztah
Previously, dockerd would always ask containerd to pass --leave-running to runc/runsc, ignoring the exit boolean value. Hence, even `docker checkpoint create --leave-running=false ...` would not stop the container. Signed-off-by: Brielle Broder <bbroder@google.com>
c9c0ef9
to
db621eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
z failure is unrelated
@bjbroder This restores the behavior of making --leave-running configurable, but is the default still true? The default used to be false. |
@tswift242 The default for --leave-running was, and still is, false. Previously, both the default (false) and an explicit set of the flag to false would cause the process to incorrectly continue running after checkpointing because "exit" had not been implemented appropriately, even though the leave-running flag is correctly set. The leave-running flag's behavior can be seen in: |
Gotcha. Thanks for the reply and the fix to this issue! |
Previously, dockerd would always ask containerd to pass --leave-running
to runc/runsc, ignoring the exit boolean value. Hence, even
docker checkpoint create --leave-running=false ...
would not stop thecontainer.
Signed-off-by: Brielle Broder bbroder@google.com