Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.0 backport] daemon: daemon.containerRestart: don't cancel restart on context cancel #46697

Merged
merged 1 commit into from Oct 26, 2023

Conversation

thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Oct 21, 2023

commit def549c (#44365) passed through the context to the daemon.ContainerStart function. As a result, restarting containers no longer is an atomic operation, because a context cancellation could interrupt the restart (between "stopping" and "(re)starting"), resulting in the container being stopped, but not restarted.

Restarting a container, or more factually; making a successful request on the /containers/{id]/restart endpoint, should be an atomic operation.

This patch uses a context.WithoutCancel for restart requests.

It's worth noting that daemon.containerStop already uses context.WithoutCancel, so in that function, we'll be wrapping the context twice, but this should likely not cause issues (just redundant for this code-path).

Before this patch, starting a container that bind-mounts the docker socket, then restarting itself from within the container would cancel the restart operation. The container would be stopped, but not started after that:

docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'

docker ps -a
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS                       PORTS     NAMES
3a2a741c65ff   docker:cli    "docker-entrypoint.s…"   26 seconds ago   Exited (128) 7 seconds ago             myself

With this patch: the stop still cancels the exec, but does not cancel the restart operation, and the container is started again:

docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
docker exec myself sh -c 'docker restart myself'
docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED              STATUS         PORTS     NAMES
4393a01f7c75   docker:cli   "docker-entrypoint.s…"   About a minute ago   Up 4 seconds             myself

- What I did

- How I did it

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah thaJeztah added this to the 24.0.7 milestone Oct 21, 2023
@thaJeztah thaJeztah force-pushed the 24.0_backport_restart_nocancel branch 2 times, most recently from 20c797b to 54e6b2b Compare October 21, 2023 13:44
@thaJeztah
Copy link
Member Author

thaJeztah commented Oct 24, 2023

Some changes compared to the PR in master, due to things not yet being in the 24.0 branch;

git diff
diff --git a/integration/container/restart_test.go b/integration/container/restart_test.go
index cf4e0c4975..bceb605c34 100644
--- a/integration/container/restart_test.go
+++ b/integration/container/restart_test.go
@@ -9,7 +9,6 @@ import (

        "github.com/docker/docker/api/types"
        "github.com/docker/docker/api/types/container"
-       "github.com/docker/docker/api/types/events"
        "github.com/docker/docker/api/types/filters"
        "github.com/docker/docker/client"
        testContainer "github.com/docker/docker/integration/internal/container"
@@ -219,11 +218,10 @@ func TestContainerWithAutoRemoveCanBeRestarted(t *testing.T) {
 //
 // Regression test for https://github.com/moby/moby/discussions/46682
 func TestContainerRestartWithCancelledRequest(t *testing.T) {
-       ctx := setupTest(t)
+       ctx := context.TODO()
+       defer setupTest(t)()
        apiClient := testEnv.APIClient()

-       testutil.StartSpan(ctx, t)
-
        // Create a container that ignores SIGTERM and doesn't stop immediately,
        // giving us time to cancel the request.
        //
@@ -233,7 +231,7 @@ func TestContainerRestartWithCancelledRequest(t *testing.T) {
        // taking place.
        cID := testContainer.Run(ctx, t, apiClient, testContainer.WithCmd("sh", "-c", "trap 'echo received TERM' TERM; while true; do usleep 10; done"))
        defer func() {
-               err := apiClient.ContainerRemove(ctx, cID, container.RemoveOptions{Force: true})
+               err := apiClient.ContainerRemove(ctx, cID, types.ContainerRemoveOptions{Force: true})
                if t.Failed() && err != nil {
                        t.Logf("Cleaning up test container failed with error: %v", err)
                }
@@ -243,7 +241,7 @@ func TestContainerRestartWithCancelledRequest(t *testing.T) {
        messages, errs := apiClient.Events(ctx, types.EventsOptions{
                Filters: filters.NewArgs(
                        filters.Arg("container", cID),
-                       filters.Arg("event", string(events.ActionRestart)),
+                       filters.Arg("event", "restart"),
                ),
        })

@@ -270,7 +268,7 @@ func TestContainerRestartWithCancelledRequest(t *testing.T) {
        select {
        case m := <-messages:
                assert.Check(t, is.Equal(m.Actor.ID, cID))
-               assert.Check(t, is.Equal(m.Action, events.ActionRestart))
+               assert.Check(t, is.Equal(m.Action, "restart"))
        case err := <-errs:
                assert.NilError(t, err)
        case <-time.After(restartTimeout):

@thaJeztah thaJeztah force-pushed the 24.0_backport_restart_nocancel branch from 54e6b2b to 3c91c0a Compare October 24, 2023 14:28
commit def549c passed through the context
to the daemon.ContainerStart function. As a result, restarting containers
no longer is an atomic operation, because a context cancellation could
interrupt the restart (between "stopping" and "(re)starting"), resulting
in the container being stopped, but not restarted.

Restarting a container, or more factually; making a successful request on
the `/containers/{id]/restart` endpoint, should be an atomic operation.

This patch uses a context.WithoutCancel for restart requests.

It's worth noting that daemon.containerStop already uses context.WithoutCancel,
so in that function, we'll be wrapping the context twice, but this should
likely not cause issues (just redundant for this code-path).

Before this patch, starting a container that bind-mounts the docker socket,
then restarting itself from within the container would cancel the restart
operation. The container would be stopped, but not started after that:

    docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
    docker exec myself sh -c 'docker restart myself'

    docker ps -a
    CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS                       PORTS     NAMES
    3a2a741c65ff   docker:cli    "docker-entrypoint.s…"   26 seconds ago   Exited (128) 7 seconds ago             myself

With this patch: the stop still cancels the exec, but does not cancel the
restart operation, and the container is started again:

    docker run -dit --name myself -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh
    docker exec myself sh -c 'docker restart myself'
    docker ps
    CONTAINER ID   IMAGE        COMMAND                  CREATED              STATUS         PORTS     NAMES
    4393a01f7c75   docker:cli   "docker-entrypoint.s…"   About a minute ago   Up 4 seconds             myself

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit aeb8972)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the 24.0_backport_restart_nocancel branch from 3c91c0a to 05d7386 Compare October 24, 2023 14:29
@thaJeztah thaJeztah marked this pull request as ready for review October 26, 2023 07:50
@thaJeztah thaJeztah merged commit 311b9ff into moby:24.0 Oct 26, 2023
101 of 102 checks passed
@thaJeztah thaJeztah deleted the 24.0_backport_restart_nocancel branch October 26, 2023 07:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants