New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[17.06] Graceful upgrade of containerd and runc state files upon live-restore #117

Merged
merged 1 commit into from Jul 24, 2017

Conversation

Projects
None yet
6 participants
@tiborvass
Contributor

tiborvass commented Jul 13, 2017

To address issue:

@andrewhsu andrewhsu added this to the 17.06.1 milestone Jul 13, 2017

Show outdated Hide outdated components/engine/libcontainerd/client_linux.go Outdated
Show outdated Hide outdated components/engine/libcontainerd/remote_unix.go Outdated
@andrewhsu

This comment has been minimized.

Show comment
Hide comment
@andrewhsu

andrewhsu Jul 13, 2017

Collaborator

In latest build result:

00:27:05.307 FAIL: docker_cli_daemon_test.go:2813: DockerDaemonSuite.TestExecWithUserAfterLiveRestore
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] waiting for daemon to start
00:27:05.307 [dd68c05e3b6f9] daemon started
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] exiting daemon
00:27:05.307 [dd68c05e3b6f9] waiting for daemon to start
00:27:05.307 [dd68c05e3b6f9] daemon started
00:27:05.307 
00:27:05.307 docker_cli_daemon_test.go:2834:
00:27:05.307     c.Assert(err, check.IsNil, check.Commentf("Output: %s", out2))
00:27:05.307 ... value *exec.ExitError = &exec.ExitError{ProcessState:(*os.ProcessState)(0xc42054f1e0), Stderr:[]uint8(nil)} ("exit status 126")
00:27:05.307 ... Output: invalid character 's' after top-level value
00:27:05.307 
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] exiting daemon
Collaborator

andrewhsu commented Jul 13, 2017

In latest build result:

00:27:05.307 FAIL: docker_cli_daemon_test.go:2813: DockerDaemonSuite.TestExecWithUserAfterLiveRestore
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] waiting for daemon to start
00:27:05.307 [dd68c05e3b6f9] daemon started
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] exiting daemon
00:27:05.307 [dd68c05e3b6f9] waiting for daemon to start
00:27:05.307 [dd68c05e3b6f9] daemon started
00:27:05.307 
00:27:05.307 docker_cli_daemon_test.go:2834:
00:27:05.307     c.Assert(err, check.IsNil, check.Commentf("Output: %s", out2))
00:27:05.307 ... value *exec.ExitError = &exec.ExitError{ProcessState:(*os.ProcessState)(0xc42054f1e0), Stderr:[]uint8(nil)} ("exit status 126")
00:27:05.307 ... Output: invalid character 's' after top-level value
00:27:05.307 
00:27:05.307 
00:27:05.307 [dd68c05e3b6f9] exiting daemon
@tiborvass

This comment has been minimized.

Show comment
Hide comment
@tiborvass
Contributor

tiborvass commented Jul 13, 2017

MountLabel string `json:"mount_label"`
Hostname string `json:"hostname"`
Namespaces configs.Namespaces `json:"namespaces"`
Capabilities linuxCapabilities `json:"capabilities"`

This comment has been minimized.

@tiborvass

tiborvass Jul 13, 2017

Contributor

@mlaventure just realized that linuxCapabilities uses specs.LinuxCapabilities (https://github.com/docker/docker-ce/pull/117/files#diff-40363fc1cc8aab85c96e0f6d4e7d4315R97) while this is supposed to be configs.Capabilities, which thankfully happens to be the same.

Maybe it's fine for this version and we can think of something more accurate after?

@tiborvass

tiborvass Jul 13, 2017

Contributor

@mlaventure just realized that linuxCapabilities uses specs.LinuxCapabilities (https://github.com/docker/docker-ce/pull/117/files#diff-40363fc1cc8aab85c96e0f6d4e7d4315R97) while this is supposed to be configs.Capabilities, which thankfully happens to be the same.

Maybe it's fine for this version and we can think of something more accurate after?

if err != nil {
return err
}
// TODO: copy caps or not copy caps?

This comment has been minimized.

@tiborvass

tiborvass Jul 13, 2017

Contributor

@mlaventure any thoughts on this?

@tiborvass

tiborvass Jul 13, 2017

Contributor

@mlaventure any thoughts on this?

This comment has been minimized.

@mlaventure

mlaventure Jul 14, 2017

Contributor

wdym by copy?

@mlaventure

mlaventure Jul 14, 2017

Contributor

wdym by copy?

This comment has been minimized.

@tiborvass

tiborvass Jul 17, 2017

Contributor

@mlaventure I mean:

copy(boundingCaps, caps)
copy(effectiveCaps, caps)
...

Otherwise, l.V.Bounding[i] == l.V.Effective[i], and if someone writes code to change an element of Bounding it will change all the other ones too, unwanted side-effect. Although it's only in very special cases, where the same underlying array is used.

@tiborvass

tiborvass Jul 17, 2017

Contributor

@mlaventure I mean:

copy(boundingCaps, caps)
copy(effectiveCaps, caps)
...

Otherwise, l.V.Bounding[i] == l.V.Effective[i], and if someone writes code to change an element of Bounding it will change all the other ones too, unwanted side-effect. Although it's only in very special cases, where the same underlying array is used.

This comment has been minimized.

@mlaventure

mlaventure Jul 18, 2017

Contributor

It's the case currently in the daemon code too: https://github.com/moby/moby/blob/master/daemon/oci_linux.go#L258-L266

@mlaventure

mlaventure Jul 18, 2017

Contributor

It's the case currently in the daemon code too: https://github.com/moby/moby/blob/master/daemon/oci_linux.go#L258-L266

This comment has been minimized.

@tiborvass

tiborvass Jul 18, 2017

Contributor

@mlaventure okay thanks, so let's keep it orthogonal.

@tiborvass

tiborvass Jul 18, 2017

Contributor

@mlaventure okay thanks, so let's keep it orthogonal.

@andrewhsu

This comment has been minimized.

Show comment
Hide comment
@andrewhsu
Collaborator

andrewhsu commented Jul 13, 2017

@tiborvass

This comment has been minimized.

Show comment
Hide comment
@tiborvass

tiborvass Jul 18, 2017

Contributor

@mlaventure I pushed a better version, that does all the decoding first, before overwriting the files. This is so that if one of the later files fails to be decoded, the previous ones keep their initial state and we don't end up in a mixed state.

Contributor

tiborvass commented Jul 18, 2017

@mlaventure I pushed a better version, that does all the decoding first, before overwriting the files. This is so that if one of the later files fails to be decoded, the previous ones keep their initial state and we don't end up in a mixed state.

@tiborvass

This comment has been minimized.

Show comment
Hide comment
@tiborvass

tiborvass Jul 18, 2017

Contributor

And now it also encodes into a bytes.Buffer before writing to file, just in case the reencoding phase fails (not just the decoding).

I hope the extra buffering doesn't become an issue when there's a lot of containers.

Contributor

tiborvass commented Jul 18, 2017

And now it also encodes into a bytes.Buffer before writing to file, just in case the reencoding phase fails (not just the decoding).

I hope the extra buffering doesn't become an issue when there's a lot of containers.

@tiborvass

This comment has been minimized.

Show comment
Hide comment
@tiborvass
Contributor

tiborvass commented Jul 21, 2017

@stevvooe

This comment has been minimized.

Show comment
Hide comment
@stevvooe

stevvooe Jul 21, 2017

Contributor

LGTM after moving to atomic file write.

Contributor

stevvooe commented Jul 21, 2017

LGTM after moving to atomic file write.

@mlaventure

This comment has been minimized.

Show comment
Hide comment
@mlaventure

mlaventure Jul 21, 2017

Contributor

I agree with @stevvooe it's better to write to a tmp file, then rename it.

Contributor

mlaventure commented Jul 21, 2017

I agree with @stevvooe it's better to write to a tmp file, then rename it.

[engine] Graceful upgrade of containerd and runc state files upon liv…
…e-restore

Vendors new dependency github.com/crosbymichael/upgrade

Signed-off-by: Tibor Vass <tibor@docker.com>
@andrewhsu

This comment has been minimized.

Show comment
Hide comment
@andrewhsu

andrewhsu Jul 22, 2017

Collaborator

LGTM

I've tested this PR by building a deb package and upgrading from a docker-ce 17.03 daemon that had a container started with live-restore. After the upgrade, container still seen as active from docker ps.

Collaborator

andrewhsu commented Jul 22, 2017

LGTM

I've tested this PR by building a deb package and upgrading from a docker-ce 17.03 daemon that had a container started with live-restore. After the upgrade, container still seen as active from docker ps.

@crosbymichael

This comment has been minimized.

Show comment
Hide comment
@crosbymichael
Member

crosbymichael commented Jul 24, 2017

LGTM

defer f.w.Close()
}
var errs []string
for _, f := range files {

This comment has been minimized.

@stevvooe

stevvooe Jul 24, 2017

Contributor

Ok, so this sucks everything up into memory, creates an atomic file for each, then writes the buffer.

@stevvooe

stevvooe Jul 24, 2017

Contributor

Ok, so this sucks everything up into memory, creates an atomic file for each, then writes the buffer.

@stevvooe

This comment has been minimized.

Show comment
Hide comment
@stevvooe

stevvooe Jul 24, 2017

Contributor

LGTM

Contributor

stevvooe commented Jul 24, 2017

LGTM

@vieux vieux merged commit 8bdf679 into docker:17.06 Jul 24, 2017

3 checks passed

ce-tests Jenkins build docker-ce-pr 10 has succeeded
Details
ce-tests-WoW-RS1 Jenkins build docker-ce-17.06-pr-WoW-RS1 200 has succeeded
Details
dco-signed All commits are signed
@tiborvass

This comment has been minimized.

Show comment
Hide comment
@tiborvass

tiborvass Jul 24, 2017

Contributor

@vieux Line for changelog: Fix issue upon upgrade, preventing docker from showing running containers when --live-restore is enabled

Contributor

tiborvass commented Jul 24, 2017

@vieux Line for changelog: Fix issue upon upgrade, preventing docker from showing running containers when --live-restore is enabled

@andrewhsu

This comment has been minimized.

Show comment
Hide comment
@andrewhsu

andrewhsu Jul 24, 2017

Collaborator

@tiborvass @vieux I've also verified this PR works on upgrade from docker-ce 17.05.0.

Collaborator

andrewhsu commented Jul 24, 2017

@tiborvass @vieux I've also verified this PR works on upgrade from docker-ce 17.05.0.

@andrewhsu

This comment has been minimized.

Show comment
Hide comment
@andrewhsu

andrewhsu Jul 24, 2017

Collaborator

Note to selves: after conversation with @tiborvass may need to look into having all 3 json files updated atomically. As this PR stands, looks good to have containers still running with live restore.

Collaborator

andrewhsu commented Jul 24, 2017

Note to selves: after conversation with @tiborvass may need to look into having all 3 json files updated atomically. As this PR stands, looks good to have containers still running with live restore.

@vieux vieux changed the title from [engine] Graceful upgrade of containerd and runc state files upon live-restore to [17.06] Graceful upgrade of containerd and runc state files upon live-restore Jul 24, 2017

docker-jenkins pushed a commit that referenced this pull request Jul 3, 2018

Merge pull request #117 from andrewhsu/p
[18.06] update package descriptions and info
Upstream-commit: 2baa88e2bcf7c451ea8405fa54948c07f722cf9e
Component: packaging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment