Docker(containerd) miss container rm command #39047
Comments
My point of view and questions:
I think the function that surprises me is this:
func (daemon *Daemon) Kill(container *containerpkg.Container) error {
	if !container.IsRunning() {
		return errNotRunning(container.ID)
	}

	// 1. Send SIGKILL
	if err := daemon.killPossiblyDeadProcess(container, int(syscall.SIGKILL)); err != nil {
		// While normally we might "return err" here we're not going to
		// because if we can't stop the container by this point then
		// it's probably because it's already stopped. Meaning, between
		// the time of the IsRunning() call above and now it stopped.
		// Also, since the err return will be environment specific we can't
		// look for any particular (common) error that would indicate
		// that the process is already dead vs something else going wrong.
		// So, instead we'll give it up to 2 more seconds to complete and if
		// by that time the container is still running, then the error
		// we got is probably valid and so we return it to the caller.
		if isErrNoSuchProcess(err) {
			return nil
		}

		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
		defer cancel()

		if status := <-container.Wait(ctx, containerpkg.WaitConditionNotRunning); status.Err() != nil {
			return err
		}
	}

	// 2. Wait for the process to die, in last resort, try to kill the process directly
	if err := killProcessDirectly(container); err != nil {
		if isErrNoSuchProcess(err) {
			return nil
		}
		return err
	}

	// Wait for exit with no timeout.
	// Ignore returned status.
	<-container.Wait(context.Background(), containerpkg.WaitConditionNotRunning)

	return nil
}
There are 2 things I cannot understand:
1. For the failed
2. It continued to call |
Continued: After this issue happened, any command of |
Description
Our test group has reported that the docker rm command has been failing for about half a year, on versions from 17.03 to 18.09.0.
Steps to reproduce the issue:
No known way to reproduce it, but the problem happens nearly every day.
Describe the results you received:
docker gets stuck in rm, like:
and docker ps like:
Describe the results you expected:
The container is removed correctly.
Additional information you deem important (e.g. issue happens only occasionally):
I've added some logging to docker/dockerd/containerd/containerd-shim/runc; here is what I have so far:
NOTE: log messages that contain [fd] were added by me.
1. docker:
Correctly receives 7 docker rm commands, but only 6 return from func runRm(dockerCli command.Cli, opts *rmOptions) error:
2. dockerd:
Gets 7 calls of func (daemon *Daemon) Kill(container *containerpkg.Container) error. 6 return nil at the end of the function, and 1 returns nil here because of a failed KILL -9 command, as:
I also got 6 events each for topic=/tasks/exit and topic=/tasks/delete:
x 6, and:
x 6
3. containerd:
containerd gets only 6 calls of func (l *local) Delete(ctx context.Context, req *api.DeleteContainerRequest, _ ...grpc.CallOption) (*ptypes.Empty, error), all of which return successfully:
4. containerd-shim:
Same as containerd: it gets only 6 calls of func (s *Service) Delete(ctx context.Context, r *ptypes.Empty) (*shimapi.DeleteResponse, error), all of which succeed.
log skipped...
5. runc:
Gets several kill calls for the failing container.
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.):