container: add a timeout for deletion #47713

Open
wants to merge 1 commit into
base: master

Commits on Apr 24, 2024

  1. container: add a timeout for deletion

    On a heavily loaded host, we were experiencing long container(d) deletion
    times:
    
        containerd[8907]: time="2024-03-25T13:47:45.938479195Z" level=debug msg="event forwarded" ns=moby topic=/tasks/exit type=containerd.events.TaskExit
        # our control plane logic deletes the successfully exited container via
        # the docker API, and...
        containerd[8907]: time="2024-03-25T13:47:47.202055216Z" level=debug msg="failed to delete task" error="context deadline exceeded" id=a618057629b35e3bfea82d5ce4cbb057ba979498496428dfe6935a1322b94add
    
    Before 4bafaa0 ("Refactor libcontainerd to minimize c8d RPCs"), when
    this happens, the docker API reports a 255 exit code and no error:
    
        0a7ddd027c0497d5a titus-executor-[900884]: Processing msg from a docker: main container exited with code 255
    
    which is especially confusing. After 4bafaa0, the behavior has changed
    to report the container's real exit code, although there is still a
    hard-coded timeout after which containerd will (try to) stop cleaning
    up. We would like to wait for this cleanup, so let's add a
    user-configurable DeleteTimeout here.
    
    Reported-by: Hechao Li <hli@netflix.com>
    Signed-off-by: Tycho Andersen <tandersen@netflix.com>
    tych0 committed Apr 24, 2024
    9463a5e