
depends_on Should Obey {{ State.Healthcheck.Status }} Before Launching Services #3754

Closed
midnightconman opened this issue Jul 21, 2016 · 26 comments

@midnightconman

Is it possible for the depends_on parameter to wait for a service to be in a "Healthy" state before starting services, if a healthcheck exists for the container? At the moment in 1.8.0-rc2 the depends_on parameter only verifies that the container is in a running state, via {{ State.Status.running }}. This would allow for better dependency management in docker-compose.

@aanand

aanand commented Jul 22, 2016

Yes, we should probably do this. It will involve:

  • Adding support for both setting the health check options for a container and querying its health state to docker-py
  • Adding support for those options (command, interval, retries, timeout) to the Compose file, probably under a health_check key
  • Updating the logic of depends_on to query containers' health state before considering a dependency to be ready
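
To sketch the second bullet, the proposed options might look roughly like this in a Compose file (a hypothetical example: the `health_check` key name and field syntax were still under discussion at this point):

```yaml
services:
  web:
    image: nginx
    # Hypothetical health_check key mirroring the Engine HEALTHCHECK options
    health_check:
      command: ["CMD", "curl", "-f", "http://localhost/"]
      interval: 10s
      retries: 3
      timeout: 5s
```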

There are some important questions to answer, e.g.

  • What qualifies a service (i.e. multiple containers) as healthy? Is it healthy if and only if all its containers are healthy?
  • Should containers not join networks until they are healthy?
  • etc

@midnightconman
Author

I like the idea of not joining a network until healthy as well; that would allow containers to auto-join via the service DNS record. Is it possible to also have a mechanism where containers leave the service DNS record if the healthcheck is failing? I think this part might be a question for Docker Engine.

@aanand

aanand commented Jul 22, 2016

Yes, that's the problem with implementing it in Compose: its state can't be monitored. I think that suggests we shouldn't look at a container's health before connecting it to the network.

@gittycat

I think that the health state of a service is separate from whether it is connected to the network. A service might need to connect to other services first before it becomes healthy.

The health check should also be done at the service level, not the container level. It makes more sense now to think in terms of services than containers, especially with Docker 1.12 (where Service has become a first-class citizen). A consumer of the service should not have to care whether the service is made up of one, two, or dozens of containers. What counts is that the service as a whole is ready to accept requests.
This means that each service would need some centralised way for its containers to report their status. The DNS resolver is already used to report each service's IPs; it could probably be used to expose the health status as well.

@aanand

aanand commented Aug 5, 2016

Quoting @dnephin in #3815 (comment):

I haven't tried out healthchecks yet. Is there some api call we can use to do "wait on container state == HEALTHY"?

If that exists, I think it would make sense to use it in depends_on. I'd rather not build polling into Compose to handle it.

I agree, so next step here would be to open a PR against Engine to implement that functionality if it doesn't already exist. It's important to get it into Engine first and as early as possible, because we have a policy of supporting at least two versions of Engine. If we can get the API endpoint into Engine 1.13, then we can get healthcheck support into Compose 1.10.

@ColinHebert

What about using the event system to do that?

You can filter based on event="health_status: healthy"
https://docs.docker.com/engine/reference/api/docker_remote_api_v1.24/#/monitor-docker-s-events
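
A sketch of that approach with docker-py (an assumption: `APIClient.events()` accepts these filters and, with `decode=True`, yields decoded event dicts; the helper that inspects a single event payload runs without a daemon):

```python
def is_healthy_event(event):
    """True if a Docker event dict signals a container turning healthy."""
    return (
        event.get("Type") == "container"
        and event.get("status") == "health_status: healthy"
    )

def wait_until_healthy(client, container_name):
    """Block until `container_name` emits a 'health_status: healthy' event.

    `client` is assumed to be a docker.APIClient; with decode=True,
    events() yields event dicts as they arrive on the stream.
    """
    stream = client.events(
        decode=True,
        filters={"container": container_name,
                 "event": "health_status: healthy"},
    )
    for event in stream:
        if is_healthy_event(event):
            return
```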

@earthquakesan

I tweaked service.py and parallel.py and was able to make one container wait for another until it is healthy.

Basically, every container that has dependencies on other containers (as I see it, dependencies are inferred from links, volumes_from, depends_on, etc.; see the get_dependency_names() method, line 519 in service.py) will wait until those containers are healthy.

Regarding the API: docker-compose uses docker-py, and the health check can be performed as follows (in service.py):

    def is_healthy(self):
        try:
            state = self.client.inspect_container(self.name)["State"]
            return state["Health"]["Status"] == "healthy"
        except docker.errors.NotFound:
            return False
        except KeyError:
            # Older Engine API versions don't report a "Health" section;
            # assume healthy for backward compatibility.
            return True

Then in parallel.py I added one more check before firing the producer for an object. The iteration over the pending loop now looks like this:

    for obj in pending:
        deps = get_deps(obj)

        if any(dep in state.failed for dep in deps):
            log.debug('{} has upstream errors - not processing'.format(obj))
            results.put((obj, None, UpstreamError()))
            state.failed.add(obj)
        elif all(
            dep not in objects or dep in state.finished
            for dep in deps
        ) and all(
            # New condition: only start once all dependencies are healthy.
            # (Is dep always an instance of the Service class here?)
            dep.is_healthy() for dep in deps
        ):
            log.debug('Starting producer thread for {}'.format(obj))
            t = Thread(target=producer, args=(obj, func, results))
            t.daemon = True
            t.start()
            state.started.add(obj)
Can any of the maintainers comment on this? It works with the up command, and the down command has no issues either. However, could it break something else?

@earthquakesan

tests/acceptance/cli_test.py::CLITestCase::test_down <- tests/integration/testcases.py

This test case hangs, though.

@dnephin

dnephin commented Sep 6, 2016

I don't think we need to use the healthcheck for volumes_from. It only requires the container to be running.

I think it would be good to only use healthchecks for depends_on, and leave links with the old method (only waiting on the container to start). That way there is a way to add dependencies without incurring the extra time waiting on healthchecks. Some applications may handle dependencies more gracefully, and we shouldn't make their startup time slower.

@mortensteenrasmussen

Sounds like a good idea @dnephin

@aanand aanand added this to the 1.9.0 milestone Sep 15, 2016
@shin- shin- modified the milestones: 1.10.0, 1.9.0 Sep 28, 2016
@earthquakesan

Is anyone working on this issue?

@Hronom

Hronom commented Nov 25, 2016

Any progress on this?

@adriancooney

@Hronom looks like a WIP pull request here: #4163

@lucasvc

lucasvc commented Apr 20, 2018

This issue should be re-opened, as it is not working:

version: "3"

services:

  rabbitmq:
    image: rabbitmq:management
    ports:
      - "5672:5672"
      - "15672:15672"
    healthcheck:
      test: ["CMD", "rabbitmqctl", "status"]

  queues:
    image: rabbitmq:management
    depends_on:
      - rabbitmq
    command: >
      bash -c "rabbitmqadmin --host rabbitmq declare queue name=my-queue"

When executing:

docker-compose up queues

Compose reports that the RabbitMQ server is not started. When I inspect the rabbitmq container, its health status is starting; after a few seconds it becomes healthy.

@earthquakesan

@lucasvc the healthchecks will not help you ensure the start-up order of your system. That is a design decision: system components should be implemented so that they can reconnect/retry if something is not up (or has died). Otherwise the system will be prone to cascading failures.

P.S. The issue will not be re-opened, the feature is behaving as expected.
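
The reconnect/retry advice above can be as small as a loop with exponential backoff around the connection attempt (a generic sketch; `connect` stands in for whatever call your client library provides):

```python
import time

def connect_with_retry(connect, attempts=5, delay=0.5, backoff=2.0):
    """Call connect() until it succeeds, sleeping between failures.

    connect is any zero-argument callable that raises on failure,
    e.g. opening a database connection. The delay grows by `backoff`
    after each failed attempt; the final failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)
            delay *= backoff
```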

@ags799

ags799 commented Apr 27, 2018

@earthquakesan I must be missing something; the OP states

Is it possible for the depends_on parameter to wait for a service to be in a "Healthy" state before starting services, if a healthcheck exists for the container?

What is the feature doing for us now?

@ags799

ags799 commented Apr 27, 2018

Also, to be clear, we're not using this feature to keep our applications waiting on their dependencies. I have a docker-compose.yml defining an app under test, a database, and a "tester" container that tests the app.

The app under test will handle waiting for and reconnecting to the database; that's no problem. The issue is the "tester" container. I'd like for it to just run go test (I'm using Go) once the app under test is ready, and this healthcheck feature seems well-suited.

It seems unnecessary for a tester container, which is just running go test, to also maintain a healthy connection with the app under test. I wish I could run these tests from the host (read: not inside a container), but strangeness with our current CI system necessitates putting everything into containers.

Hope this makes sense.

@shin-

shin- commented Apr 27, 2018

Please refer to the docs on how to declare health check dependencies:
https://docs.docker.com/compose/compose-file/compose-file-v2/#depends_on
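
For reference, the version 2.1 file format linked above supports a long form of depends_on with a condition. Applied to the RabbitMQ example earlier in this thread, it would look like this (note this works in the 2.1-2.4 file formats but was removed from 3.x):

```yaml
version: "2.1"

services:
  rabbitmq:
    image: rabbitmq:management
    healthcheck:
      test: ["CMD", "rabbitmqctl", "status"]
      interval: 10s
      retries: 5

  queues:
    image: rabbitmq:management
    depends_on:
      rabbitmq:
        condition: service_healthy
    command: rabbitmqadmin --host rabbitmq declare queue name=my-queue
```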

@ags799

ags799 commented Apr 27, 2018

It's unfortunate they pulled the condition feature from depends_on in the 3.0 file format. It seems the official recommendation is wait-for.

@earthquakesan

earthquakesan commented Apr 28, 2018

@ags799 I've been following the discussions here and the timeline was as follows:

  1. depends_on waited for a Docker container to start (Compose file versions 1 and 2)
  2. HEALTHCHECK and the depends_on "condition" form were introduced (Compose file version 2.1). AFAIK, condition is only supported in the 2.1+ formats.
  3. Rollback to 1. (Compose file version 3)
  4. The feature was deprecated for Docker Swarm (citing the docs: "The depends_on option is ignored when deploying a stack in swarm mode with a version 3 Compose file.")

As the most up-to-date (and, in my view, best) way to deploy is with the docker stack deploy command, the feature is basically deprecated.

Here is a better "wait-for-it" version. The code is licensed under MIT, so feel free to reuse it.
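
Scripts of the wait-for-it kind boil down to polling a TCP port until it accepts connections. A minimal Python equivalent (a sketch, not the linked code):

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds.

    Returns True once the port accepts a connection, or False if
    `timeout` seconds elapse first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```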

@lucasvc

lucasvc commented Apr 28, 2018

@shin- about

Please refer to the docs on how to declare health check dependencies:
https://docs.docker.com/compose/compose-file/compose-file-v2/#depends_on

Why should I maintain two healthchecks, when Docker already provides me a way to know whether the service is healthy?
DRY tells me to use the Docker feature, but between containers I think this is not doable.

@shin-

shin- commented Apr 28, 2018

I'm not sure what you mean; there's no need to maintain two healthchecks.

@pecigonzalo

Yes, there are two: the wait-for script and the healthcheck. Also, wait-for is insufficient for many scenarios, e.g. Postgres DB init: we use pg_isready, which not only tests the port but also that the DB is there.

@jjulien

jjulien commented Aug 13, 2019

Waiting for a container to become ready does seem like it should be native functionality. Including a wait-for-it script seems like an acceptable hack in lieu of the feature, but not a refined solution to the problem.

@matti

matti commented Aug 13, 2019

It would need something like Kubernetes readinessProbes, but as "waitForItProbes", which doesn't really solve the problem: if container A dies/restarts after signaling "ready" to container B, container B will crash anyway because A is not available.

I suggest closing this issue, because the only way to "do" this is to have wait-for-it.sh or a similar setup.

@patsevanton

@ags799 I've been following the discussions here and the timeline was as follows:

  1. depends_on waited for a Docker container to start (Compose file versions 1 and 2)
  2. HEALTHCHECK and the depends_on "condition" form were introduced (Compose file version 2.1). AFAIK, condition is only supported in the 2.1+ formats.
  3. Rollback to 1. (Compose file version 3)
  4. The feature was deprecated for Docker Swarm (citing the docs: "The depends_on option is ignored when deploying a stack in swarm mode with a version 3 Compose file.")

As the most up-to-date (and, in my view, best) way to deploy is with the docker stack deploy command, the feature is basically deprecated.

Here is a better "wait-for-it" version. The code is licensed under MIT, so feel free to reuse it.

Hello! Where can I read about this secret knowledge? Maybe an issue, blog post, or official doc? Thanks!
