Try to avoid issues when the Docker daemon restarts or stops on RHEL/CentOS 6 #8094
Component: docker-io-1.1.2-1.el6.x86_64
Hardware Platform: x86_64
Platform: CentOS 6.5

Summary: On restart or stop, docker can lose track of currently mounted dm devices when container shutdown takes a long time.

Details: The docker daemon can lose track of mounted device mapper thin volumes if the daemon is killed. The RHEL 6 initscript uses killproc -p to stop the Docker daemon (see https://github.com/docker/docker/blob/v1.1.2/contrib/init/sysvinit-redhat/docker#L71). The daemon gets killed this way if a process takes longer than the killproc timeout to exit, or if the daemon receives a SIGKILL. After a restart, docker is unable to restart any containers that were still active when the daemon died.

Reproducibility: Every time

Steps to Reproduce: Given a CentOS 6.5 instance with docker-io-1.1.2 installed:

Actual Results: What happened when you reached the bug?

Expected Results: What do you think was supposed to happen?

Workaround: After verifying the contained processes have exited, identify the affected device mapper ID by using
This change will allow the Docker daemon's init script to wait up to 5 minutes before being forcibly terminated by the initscript. Many non-trivial containers will take more than the default 3 seconds to stop, which can result in containers whose rootfs is still mounted and will not restart when the daemon starts up again, or worse, orphan processes that are still running.

Signed-off-by: Steven Merrill <steven.merrill@gmail.com>
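For readers unfamiliar with the stop sequence described above, here is a minimal sketch of the general "SIGTERM, wait for a grace period, then SIGKILL" pattern. It is not the initscript's actual code: the function name `stop_with_grace` and its arguments are hypothetical, and the 300-second default is only taken from the "5 minutes" mentioned in the commit message. On RHEL/CentOS 6 the same effect would presumably come from passing a longer delay to killproc rather than from a hand-rolled loop.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Docker initscript): ask a process
# to exit with SIGTERM, poll until it is gone or the grace period runs
# out, and only then fall back to SIGKILL.
stop_with_grace() {
    pid="$1"
    delay="${2:-300}"    # seconds; 300 mirrors the 5 minutes in this change

    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$delay" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1     # grace period expired; process was force-killed
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0             # process exited within the grace period
}
```

With a 3-second grace period, a daemon that is still stopping containers would reach the SIGKILL branch; a longer delay gives it time to unmount each container's rootfs first.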
@maxamillion @jperrin ... could you check if this is good for merge here, and also for the el6 rpm?
Yeah, I'm okay with this for now. It's a valid point, although I think there might be a better way to solve this long term. This is a good short term fix.
It is an interesting problem, because the docker daemon will obviously also try to shut down all running containers when it gets the signal to shut down, and it will probably always know better than a shell script how to do that properly, especially since parts of the daemon can register additional tasks with eng.onShutdown(). I had originally thought of putting something in the init script that would loop through
Seems hacky, but LGTM. You're good for a merge on this then, @lsm5?
@tianon yup