New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase container default shutdown timeout on Windows #35184

Merged
merged 1 commit into from Oct 23, 2017

Conversation

Projects
None yet
7 participants
@darstahl
Contributor

darstahl commented Oct 12, 2017

The shutdown timeout for containers in insufficient on Windows. If the daemon is shutting down, and a container takes longer than expected to shut down, this can cause the container to remain in a bad state after restart, and never be able to start again. Increasing the timeout makes this less likely to occur.

This helps reduce the likelihood of #29392 happening while waiting for containerd 1.0 integration, which changes the model here enough that the real fix will be in containerd.

Signed-off-by: Darren Stahl darst@microsoft.com

@jhowardmsft

LGTM

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Oct 13, 2017

Member

We may have to check the documentation, as I think it mentions the default 10 seconds in some places

Member

thaJeztah commented Oct 13, 2017

We may have to check the documentation, as I think it mentions the default 10 seconds in some places

@darstahl

This comment has been minimized.

Show comment
Hide comment
@darstahl

darstahl Oct 16, 2017

Contributor

Looks like some CI tests are checking that the logs contain the shutdown time specified in the config.

The log message they look for was never correct, as it just read from the config, while there is a more complicated shutdown time computation used for the actual shutdown timeout. I've changed the log from:

level=debug msg="start clean shutdown of all containers with a 3 seconds timeout..."

to:

level=debug msg="daemon configured with a 3 seconds minimum shutdown timeout"
level=debug msg="start clean shutdown of all containers with a 15 seconds timeout..."

(Assuming a 3 second configured shutdown timeout, and a calculated actual shutdown timeout of 15 seconds due to a 10 second container shutdown timeout and a 5 second shutdown grace period)

I haven't looked at docs yet, will do soon

Contributor

darstahl commented Oct 16, 2017

Looks like some CI tests are checking that the logs contain the shutdown time specified in the config.

The log message they look for was never correct, as it just read from the config, while there is a more complicated shutdown time computation used for the actual shutdown timeout. I've changed the log from:

level=debug msg="start clean shutdown of all containers with a 3 seconds timeout..."

to:

level=debug msg="daemon configured with a 3 seconds minimum shutdown timeout"
level=debug msg="start clean shutdown of all containers with a 15 seconds timeout..."

(Assuming a 3 second configured shutdown timeout, and a calculated actual shutdown timeout of 15 seconds due to a 10 second container shutdown timeout and a 5 second shutdown grace period)

I haven't looked at docs yet, will do soon

@vdemeester

LGTM 🐯

@cpuguy83

LGTM

@cpuguy83

This comment has been minimized.

Show comment
Hide comment
@cpuguy83

cpuguy83 Oct 20, 2017

Contributor

I have a feeling this will cause a merge conflict with #34895.
I'd like to hold off merging this until that is in (assuming we can get that one in today).

Contributor

cpuguy83 commented Oct 20, 2017

I have a feeling this will cause a merge conflict with #34895.
I'd like to hold off merging this until that is in (assuming we can get that one in today).

@yongtang

This comment has been minimized.

Show comment
Hide comment
@yongtang

yongtang Oct 23, 2017

Member

@darrenstahlmsft can you do a rebase? I think we are safe to merge once the PR is rebased.

Member

yongtang commented Oct 23, 2017

@darrenstahlmsft can you do a rebase? I think we are safe to merge once the PR is rebased.

Increase container default shutdown timeout on Windows
The shutdown timeout for containers in insufficient on Windows. If the daemon is shutting down, and a container takes longer than expected to shut down, this can cause the container to remain in a bad state after restart, and never be able to start again. Increasing the timeout makes this less likely to occur.

Signed-off-by: Darren Stahl <darst@microsoft.com>
@darstahl

This comment has been minimized.

Show comment
Hide comment
@darstahl

darstahl Oct 23, 2017

Contributor

Rebased.

Contributor

darstahl commented Oct 23, 2017

Rebased.

@yongtang

This comment has been minimized.

Show comment
Hide comment
@yongtang

yongtang Oct 23, 2017

Member

@darrenstahlmsft Thanks! 👍

Member

yongtang commented Oct 23, 2017

@darrenstahlmsft Thanks! 👍

@yongtang yongtang merged commit 7848b8b into moby:master Oct 23, 2017

6 checks passed

dco-signed All commits are signed
experimental Jenkins build Docker-PRs-experimental 37476 has succeeded
Details
janky Jenkins build Docker-PRs 46167 has succeeded
Details
powerpc Jenkins build Docker-PRs-powerpc 6562 has succeeded
Details
windowsRS1 Jenkins build Docker-PRs-WoW-RS1 17749 has succeeded
Details
z Jenkins build Docker-PRs-s390x 6366 has succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment