Docker restart fails in rare cases, and the container never starts again. #47549
Labels
kind/bug
status/0-triage
version/20.10
Description
In rare cases, docker restart fails and the container is never started again.
Reproduce
Run docker restart on a container.
In rare cases, the container takes a long time to exit after SIGKILL.
Expected behavior
The container should be brought up.
docker version
docker info
Additional Info
Log
time="2024-03-02T15:58:19.549568691Z" level=info msg="Container failed to exit within 1m30s of signal 15 - using the force" container=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f
time="2024-03-02T15:58:29.579026954Z" level=error msg="Container failed to exit within 10 seconds of kill - trying direct SIGKILL" container=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f error="context deadline exceeded"
time="2024-03-02T15:58:33.580895749Z" level=error msg="Error killing the container" container=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f error="tried to kill container, but did not receive an exit event"
time="2024-03-02T15:58:33.588485246Z" level=error msg="Handler for POST /containers/MySQL/restart returned error: Cannot restart container MySQL: tried to kill container, but did not receive an exit event"
time="2024-03-02T15:58:51.564412478Z" level=info msg="ignoring event" container=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2024-03-02T15:58:51.564448059Z" level=info msg="shim disconnected" id=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f
time="2024-03-02T15:58:51.564496272Z" level=warning msg="cleaning up after shim disconnected" id=38d614925484c56efcba34aa3a3f25259e2f2acc4fbb00b6a550051942c7505f namespace=moby
time="2024-03-02T15:58:51.564504721Z" level=info msg="cleaning up dead shim"
time="2024-03-02T15:58:51.570746324Z" level=warning msg="cleanup warnings time="2024-03-02T15:58:51Z" level=info msg="starting signal loop" namespace=moby pid=3996640 runtime=io.containerd.runc.v2\n"
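The timestamps in the log above suggest that the task was eventually torn down, but only after the restart request had already returned an error. As a rough check, the following sketch computes the gap between the "Error killing the container" line and the later /tasks/delete event. The helper name `gap` is made up for this illustration; the timestamps are copied from the log.

```go
package main

import (
	"fmt"
	"time"
)

// gap is a hypothetical helper: it parses two RFC 3339 timestamps as they
// appear in the daemon log and returns how much later the second one is.
func gap(earlier, later string) (time.Duration, error) {
	a, err := time.Parse(time.RFC3339Nano, earlier)
	if err != nil {
		return 0, err
	}
	b, err := time.Parse(time.RFC3339Nano, later)
	if err != nil {
		return 0, err
	}
	return b.Sub(a), nil
}

func main() {
	// "Error killing the container" vs. the later /tasks/delete event.
	d, err := gap("2024-03-02T15:58:33.580895749Z", "2024-03-02T15:58:51.564412478Z")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("delete event arrived after the kill error by:", d)
}
```

If this reading of the log is right, the delete event arrived roughly 18 seconds after the kill error was reported, by which point the restart handler had already given up.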
Code path
In
It called
Then it called
Then it called
Then it called
Then it returns error
In the restart error handling.
Proposal
The restart can fail because killing the container fails. Worse, the container is then never brought up again.
So one solution may be:
When restart sees the error "tried to kill container, but did not receive an exit event", it should wait for the container to be killed and then start it.