Docker Swarm update killing old containers too early before container is fully initialized #35881

Closed
pjebs opened this issue Dec 27, 2017 · 24 comments

Comments

@pjebs

pjebs commented Dec 27, 2017

My service has a Dockerfile that ends like this:

HEALTHCHECK --interval=1m --timeout=3s --start-period=45s \
  CMD curl -f http://localhost/ || exit 1

COPY init.sh /

ENTRYPOINT chmod a+x /init.sh && /init.sh

The webserver is initialized at the end of the initialization script, after many other things are done beforehand (such as git-pulling the new code for my PHP project).
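Roughly, such an init script might look like this (the paths, repository, and server command here are placeholders, not the actual script):

#!/bin/sh
# rough placeholder for init.sh: do all setup first, start the webserver last
cd /var/www/html
git pull origin master        # fetch the latest PHP code
composer install --no-dev     # any other setup steps
exec apache2-foreground       # the webserver only becomes reachable here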

I have noticed that docker service update --force XXX seems to kill the old container and assume the new container is good to go BEFORE my init.sh script is finished.

This means my website has a blackout for a minute or 2 until the init script finishes loading everything.

How can I solve this?

pjebs changed the title from "Docker Swarm update" to "Docker Swarm update killing old containers too early" on Dec 27, 2017
@chanwit

chanwit commented Dec 27, 2017

Have you tried --update-order start-first and increasing the wait time with --update-monitor, which is 5s by default?
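For illustration, those options could be applied like this (the service name myservice and the 60s value are placeholders):

docker service update \
  --update-order=start-first \
  --update-monitor=60s \
  myservice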

@pjebs
Author

pjebs commented Jan 6, 2018

@chanwit your suggestion is based on a time delay. I need docker swarm to be notified that the container is "initialised and ready" after my init script is finished. Only then should docker swarm start killing off the old containers.

@pjebs
Author

pjebs commented Jan 9, 2018

@thaJeztah Is there another way to achieve this? Perhaps "the docker" way? Notifying that a container is ready and initialized seems like something that goes without saying.

@thaJeztah
Member

So, first of all; scaling your service would resolve the primary issue: if you only have a single instance of a service, it means you have no redundancy, so when that one instance of the service goes down, your service is down. The --update-order and --update-parallelism options may help as well; but a "blue/green" deployment could be something to consider (see this repository for an illustration of how to do that in Docker).

The webserver is initialized at the end of the initialization script after many other things are done before hand (such as git-pull the new code for my PHP project).

This really sounds like the wrong way of doing things: how are you able to tell if your container will work at all? Fetching/updating the code for your service at runtime is risky, and throws away one of the major advantages of using containers; providing a reproducible environment for your service.

PHP is an interpreted language, but doing a git-pull of your PHP code would be the equivalent of "compiling a new binary from source" in other languages, which is not something you'd want to do on your production server.

Containers should be treated as immutable (where possible), and instead of fetching the source code at runtime, this should be done beforehand, during docker build.

Building an image from source (in your case; adding PHP code, fetching dependencies etc) allows you to (build,) test, and run your image before you deploy it: you can verify the image, have it scanned for vulnerabilities, and know exactly what code will be deployed. It also allows you to revert to a previous version of your code by using a previous version of the image.

And if you use multi-stage builds, you can keep your images minimal, and keep tools that are only needed during build out of the final image (reducing the risk of deploying an image that contains vulnerabilities), for example:

# the build stage
FROM some-image AS build-stage

RUN apt-get update && apt-get install <some build tools>

# copy your site's source files to the image (assuming the Dockerfile
# is kept in source control together with your PHP source - doing so
# doesn't require you to `git clone` your source code, and makes
# the build predictable)
COPY . /src

RUN <build your site, clean up, optimize, etc.>


# start with a clean php/apache image for the final image
FROM php:7-apache

# add the site to this image
COPY --from=build-stage /site /var/www/html
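Building and deploying an image produced this way might then look like the following (the tag and service name are placeholders); reverting is just a matter of updating the service back to the previous tag:

docker build -t myapp:1.0 .
docker service update --image myapp:1.0 myservice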

@pjebs
Author

pjebs commented Jan 10, 2018

I have a build script that installs all the required software.
The init script finishes off the process because it requires certain environment variables which are not available at docker build time. Those environment variables determine which repository to pull from (amongst other things).

  • The docker swarm solution does not require me to have multiple instances or multiple replicas (in fact I specifically need replica=1)
  • This is a really basic requirement. It definitely goes without saying - presumably the exact reason why you introduced the HEALTHCHECK feature.
  • But for me, the healthcheck feature does not solve the problem because I have a load-balancer global service that routes requests to this service, so I can't just curl localhost. In fact, I had to remove the healthcheck statement.
  • Other competing solutions do this with ease.

This really sounds like the wrong way of doing things: how are you able to tell if your container will work at all? Fetching/updating the code for your service at runtime is risky, and throws away one of the major advantages of using containers; providing a reproducible environment for your service.

That's what the healthcheck is for. If, after the init script is complete (and it signals to Docker that the container is ready and initialised), the healthcheck fails, then you know the container failed. Quite simple.

@pjebs
Author

pjebs commented Jan 10, 2018

You mentioned increasing replica count. That doesn't do anything because Docker swarm kills off all the old containers and loads new containers before they are fully initialised.

@pjebs
Author

pjebs commented Jan 10, 2018

Your analogy comparing PHP to a compiled binary is also not quite the same because I need to pull a different PHP codebase based on the environment variable. If I could docker build based on environment variables, that would be amazing, but despite numerous complaints such as #6822 (comment), I accept Docker's reasoning as sound. So your suggestion of doing everything at the Docker build stage is not adequate.

pjebs changed the title from "Docker Swarm update killing old containers too early" to "Docker Swarm update killing old containers too early before container is fully initialized" on Jan 10, 2018
@thaJeztah
Member

Those environment variables (amongst other things) determine which repository to pull from.
..

Your analogy comparing PHP and a compiled binary is also not quite the same because I need to pull a different PHP codebase based on the environment variable.

Why is it different? You're building an application (website) from a different code base; the only difference with a compiled binary is that it doesn't "compile" a binary. If your code uses (e.g.) composer, or (idk) generates stylesheets, you're building an application.

The docker swarm solution does not require me to have multiple instances or multiple replicas (in fact I specifically need replica=1)

Swarm does not help you there; it will create a new instance of the service if the service becomes unhealthy, but during that time, the service will be down (because there's no redundancy)

That's what the healthcheck is for. If after the init script is complete (and it signals to docker that the container is ready and initialised) and healthcheck fails, then you know container failed. Quite simple.

A health check checks if the service is "healthy", but it's not a replacement for testing that the application works without issues; a file may be missing, a bug may be in one of your .php files, and the only instance of exactly that version of the code is now in that container, not in any image.

You mentioned increasing replica count. That doesn't do anything because Docker swarm kills off all the old containers and loads new containers before they are fully initialised.

Have you tried the --update-order=start-first option, and set --update-parallelism=1 ?

@pjebs
Author

pjebs commented Jan 10, 2018

--update-order=start-first: the documentation does not explain clearly what it does

@thaJeztah
Member

start-first creates the new instance, before killing the old one; see #30261, #31955

the documentation does not explain clearly what it does

hm, yes, looks like the documentation is sparse on that; I opened an issue for that: docker/cli#795

@pjebs
Author

pjebs commented Jan 10, 2018

Isn't that the default? In any case, that's of no use to me.

@thaJeztah
Member

Isn't that the default?

No, it's not the default: not all use-cases can handle multiple instances being started; the default is "stop-first"; see the output of docker service create --help;

      --update-order string                Update order ("start-first"|"stop-first") (default "stop-first")

In any case, that's of no use to me.

Can you elaborate why it's of no use to you? Because I think this is what you're asking for:

Here's a simple example;

Build version 1 of our image

docker build -t myapp -<<EOF
FROM nginx:alpine
RUN echo 'VERSION 1' > /usr/share/nginx/html/index.html
EOF

Deploy the service, with a health-check (this healthcheck will become "healthy" after a number of tries, illustrating your "installation" steps of the container):

docker service create --name=example-start-first \
  --health-cmd='if [ ! -f "/count" ] ; then ctr=0; else ctr=`cat /count`; fi; ctr=`expr ${ctr} + 1`; echo "${ctr}" > /count; if [ "$ctr" -gt 2 ] ; then exit 0; else exit 1; fi' \
  --health-interval=10s \
  --health-timeout=3s \
  --health-retries=3 \
  --health-start-period=60s \
  --update-order=start-first \
  --update-parallelism=1 \
  -p8080:80 \
  myapp:latest
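For readability, the one-line health command above is roughly equivalent to this small shell script (shown only to clarify what the one-liner does):

#!/bin/sh
# count how many health checks have run; report healthy only from the third check onwards
if [ ! -f /count ]; then ctr=0; else ctr=$(cat /count); fi
ctr=$(expr "$ctr" + 1)
echo "$ctr" > /count
if [ "$ctr" -gt 2 ]; then exit 0; else exit 1; fi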

Meanwhile, in another shell, try connecting to the service (watch curl localhost:8080), which will fail to connect because the service is not healthy yet (thus no traffic routed to it);

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 8080: Connection refused

After 40 seconds or so, the service becomes healthy, and connection succeeds, showing VERSION 1:

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100    10  100    10    0     0   1381      0 --:--:-- --:--:-- --:--:--  1428
VERSION 1

Now, build a new version of the image (to illustrate the new version being deployed);

docker build -t myapp -<<EOF
FROM nginx:alpine
RUN echo 'VERSION 2' > /usr/share/nginx/html/index.html
EOF

And update the service (I tagged both images :latest, and use --force here to reproduce your original description);

docker service update --force example-start-first

In the second shell (which is still running watch curl localhost:8080), you'll see that Docker keeps serving VERSION 1

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100    10  100    10    0     0   1381      0 --:--:-- --:--:-- --:--:--  1428
VERSION 1

Until (after 40 seconds) the updated instance becomes healthy, at which point traffic is no longer routed to the old instance, but cut over to the new one;

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100    10  100    10    0     0   1410      0 --:--:-- --:--:-- --:--:--  1428
VERSION 2

At that point, the original instance is stopped;

docker service ps example-start-first

ID                  NAME                        IMAGE               NODE                    DESIRED STATE       CURRENT STATE                ERROR               PORTS
kxdtb4p9lzfp        example-start-first.1       myapp:latest        linuxkit-025000000001   Running             Running about a minute ago                       
3a6448vf2ybg         \_ example-start-first.1   myapp:latest        linuxkit-025000000001   Shutdown            Shutdown 58 seconds ago                          

@pjebs
Author

pjebs commented Jan 10, 2018

Thank you for your detailed assistance but unfortunately it still isn't applicable to me because you did a second docker build to represent the updated code base.

My repo path for the code base changes entirely, hence it's defined by an environment variable.

The only way to reconcile your method is to:

  • base Dockerfile with everything installed
  • a template Dockerfile inheriting the base Dockerfile and then substituting the env variable
  • docker build the template Dockerfile after the substitutions are applied
  • have a program on the Docker host that periodically runs 'rmi -f' on non-base images

@pjebs
Author

pjebs commented Jan 10, 2018

Correction:

  • have a base Dockerfile with everything installed
  • a template Dockerfile where I substitute env variables and then git-pull the new PHP code base
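In that scheme, the template Dockerfile might look roughly like this (the base image name and the REPO_URL build argument are hypothetical placeholders, and git is assumed to be installed in the base image):

# template Dockerfile (sketch): extends the prepared base image and pulls the PHP code
FROM my-base-image:latest
ARG REPO_URL
RUN git clone "$REPO_URL" /var/www/html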

@thaJeztah
Member

Thank you for your detailed assistance but unfortunately it still isn't applicable to me because you did a second docker build to represent the updated code base.
...
The only way to reconcile your method is to:

Have you tried? The different images are just for illustration; updating an environment variable is even easier as it wouldn't require --force to be used (but it can still be used);

Take this image instead:

docker build -t myapp -<<EOF
FROM nginx:alpine
ENV VERSION=default
CMD echo hello \$VERSION > /usr/share/nginx/html/index.html; exec nginx -g 'daemon off;'
EOF

Deploy the service, using VERSION=1 for the environment variable;

docker service create --name=example-start-first \
  --env=VERSION=1 \
  --health-cmd='if [ ! -f "/count" ] ; then ctr=0; else ctr=`cat /count`; fi; ctr=`expr ${ctr} + 1`; echo "${ctr}" > /count; if [ "$ctr" -gt 2 ] ; then exit 0; else exit 1; fi' \
  --health-interval=10s \
  --health-timeout=3s \
  --health-retries=3 \
  --health-start-period=60s \
  --update-order=start-first \
  --update-parallelism=1 \
  -p8080:80 \
  myapp:latest

Verify that the service is running (i.e. it prints hello 1);

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100     8  100     8    0     0    777      0 --:--:-- --:--:-- --:--:--   800
hello 1

Update the service, update the VERSION environment variable;

docker service update --env-add=VERSION=2 --force example-start-first

And, after 40 seconds, see that it prints hello 2;

Every 2.0s: curl localhost:8080

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100     8  100     8    0     0   1309      0 --:--:-- --:--:-- --:--:--  1333
hello 2

@BretFisher

Yea, maybe to say it a different way regarding your desire to have zero downtime during a service update @pjebs:

If you had docker service create --name webapp your-php-image, which means it only has one replica, and let's say you then did a docker service update --force webapp, you will have downtime. That is by design:

  1. The default is for service update to take down your existing task (container) before starting the new one.
  2. A healthcheck won't change this default behavior.

To ensure that a service update has zero downtime with only a single replica, you have to (at a minimum):

  1. Add --update-order start-first to the service update command (see the sketch after this list) so that the service definition knows you want to start a 2nd task while the 1st is still running.
  2. Ensure you have the healthcheck enabled like your Dockerfile shows so that swarm truly knows when the new container is "Running".
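For example, such an update might look like this (a sketch, reusing the webapp service name from above):

docker service update \
  --update-order start-first \
  --force \
  webapp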

If you have those two things set in the service update, then this is roughly what will happen:

  1. Swarm will schedule a new task, with its desired state set to Ready.
  2. The node with the new task will pull any updated image and create the container.
  3. It will start the container and wait for it to reach the Healthy state.
  4. Once Healthy, Swarm adds the new container to the load balancer VIP.
  5. There are now two tasks receiving traffic from the load balancer.
  6. Swarm will set the old task desired state to Shutdown.
  7. The node with the old task will start the process of shutting down the container, beginning with taking it out of the load balancer.

Does this make sense? Have you tried it this way?

@pjebs
Author

pjebs commented Jan 30, 2018

I'll be looking into this issue and the submitted solutions in a few weeks. Got eye strain issues and can't use a computer much.

@pjebs
Author

pjebs commented Apr 25, 2018

@thaJeztah I tried out your solution. It's definitely the correct approach.

What's your suggestion on how to automatically delete the old image? After rebuilding the image with a later version of the PHP application, the old image needs to be deleted.

@BretFisher

@pjebs you'd need to use the prune command on each node, which you can do from a service once a day with something like:

docker service create --name prune-images --mode global \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
docker sh -c "while true; do docker image prune -af; sleep 86400; done"

@pjebs
Author

pjebs commented May 5, 2018

This seems relevant:
#30321

pjebs closed this as completed May 5, 2018
@pjebs
Author

pjebs commented May 6, 2018

@BretFisher

docker build -t myapp -<<EOF
FROM nginx:alpine
ENV VERSION=default
CMD echo hello \$VERSION > /usr/share/nginx/html/index.html; exec nginx -g 'daemon off;'
EOF

Is there a reason build ARGs aren't used in this scenario? They seem like the exact use case.
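For context: a build ARG bakes the value in at docker build time rather than reading it from the service environment at runtime; a minimal sketch of what that could look like, reusing the example above (illustrative only):

docker build -t myapp --build-arg VERSION=2 -<<EOF
FROM nginx:alpine
ARG VERSION=default
RUN echo hello \$VERSION > /usr/share/nginx/html/index.html
EOF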

@pascalandy

pascalandy commented May 27, 2018

This is a great way to do a docker prune! @BretFisher

I'm using a standard cron at the moment.
It's one of those quick wins that deserves a blog post :-p

docker service create \
--name cron-docker-prune \
--mode global \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
docker sh -c "while true; do docker image prune -af; sleep 86400; done"

About --name: of course it's not a cron, but it acts like one :-p

@thaJeztah
Member

thaJeztah commented May 28, 2018

If you don't want to update the container's command, or want to create a "cron" for minimal containers that don't have a shell or sleep command, you can also use the restart policy for swarm services;

For example, the command below creates a service for which Docker spins up a new task every 30 seconds. The container's command is not a long-running process, so the container exits directly after running it. Docker (SwarmKit) notices that the container exited and, because of that, retries after 30 seconds (--restart-delay).

When deploying the service from the command line, use the --detach option to disable the "interactive" deploy (otherwise, you'll be waiting for the "service to reconcile", because it never keeps running 😂);

docker service create \
  --detach \
  --restart-delay=30s \
  --name cronnie \
  busybox date '+%Y-%m-%d %H:%M:%S doing my thing'

Checking logs for the service above shows something like;

docker service logs -f cronnie

cronnie.1.zi8k2zu530tb@linuxkit-025000000001    | 2018-05-28 08:39:41 doing my thing
cronnie.1.r9r9w06w0rrl@linuxkit-025000000001    | 2018-05-28 08:40:12 doing my thing
cronnie.1.xtlmhmo7xyzs@linuxkit-025000000001    | 2018-05-28 08:40:43 doing my thing
cronnie.1.s0hiu7b3vgkq@linuxkit-025000000001    | 2018-05-28 08:41:14 doing my thing
cronnie.1.j3s2gdrf4jmp@linuxkit-025000000001    | 2018-05-28 08:41:45 doing my thing
cronnie.1.pwqzxvj7fi6k@linuxkit-025000000001    | 2018-05-28 08:42:16 doing my thing

@pascalandy

pascalandy commented Jun 5, 2018

Hi,
I just tested this carefully and it works great!!

docker service update \
	--update-order=start-first \
	--image devmtl/ghostfire:"$EDGE_SHA" \
	$service-name;

Updating a web service behind Traefik.

If I hit the webpage during this step of the overall progress output:

overall progress: 1 out of 1 tasks
1/1:
verify: Service converged

... I get one bad request (404), and then it's all good. I feel the issue is that my reverse proxy (Traefik 1.6.2) doesn't have time to catch up. So when the request hits the previous service (404), Traefik refreshes all services and finds the newest VIP address, and the next request is properly served.

Is there a way to force Traefik to update the VIP "table" at that very moment?
