
introduce up --wait condition #8777

Merged
merged 1 commit into docker:v2 from up_wait on Nov 3, 2021
Conversation

@ndeloof ndeloof (Contributor) commented Oct 11, 2021

What I did
introduced compose up --wait to run detached BUT wait for services to reach the running state, or healthy for those with a healthcheck defined.

Typical usage is to have some backend services (let's say a postgres database) run by Compose, where the user needs to wait for the service to actually be up before applying database migrations and starting a happy coding day.

services:
  db:
    image: postgres:alpine
    ports: ...
    healthcheck:
      test: ['CMD', 'pg_isready']  
$ docker compose up -d --wait
(...)
$ rake db:migrate
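
(Note: the healthcheck's timing options control how long --wait may block. A slightly fuller configuration, with illustrative values that are not part of this PR, might look like:

services:
  db:
    image: postgres:alpine
    healthcheck:
      test: ['CMD', 'pg_isready', '-U', 'postgres']
      interval: 1s
      timeout: 5s
      retries: 10
      start_period: 5s
)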

Related issue
#8351


@thaJeztah (Member)

Wondering why this should not be the default for -d / --detach; -d would mean that you want the stack/project to be running in the background, which shouldn't be mutually exclusive with "keep the CLI running until it reaches that state", similar to how docker service create waits for the service to reconcile its desired state. One thing that may be needed in that case is a way to detach from the client (so that deploying the stack can continue).

@ndeloof ndeloof (Contributor, Author) commented Oct 12, 2021

Wondering why this should not be the default for -d

we can hardly change the existing behavior

@thaJeztah (Member)

So, it currently already waits for things to be created and started. Are there any cases where the up action shouldn't be considered "complete" until services have become healthy? (curious)

Looking at available options; what does ServiceConditionStarted do? (isn't that the same as the default?)

@ndeloof ndeloof (Contributor, Author) commented Oct 12, 2021

"service_started" indeed is the default condition.

Are there any cases where the up action shouldn't be considered "complete" until services have become healthy? (curious)

First, a service might not define a healthcheck. Other than that, this is just how things used to be, aka "Legacy"

@mik-laj mik-laj commented Oct 13, 2021

Do we need any unit tests, integration tests, E2E tests?

@thaJeztah (Member)

service might not define a healthcheck

For that case, we should consider "running" to be the equivalent of "healthcheck==ok". Healthcheck during startup is just a more customized "readiness" check. If there's no customized one, then "container running" means it's ready.

Curious; what's the current behavior in this PR if I would pass a --wait on a service that doesn't have a healthcheck?

@ndeloof ndeloof (Contributor, Author) commented Oct 15, 2021

I don't think we should assume healthy == running; better to detect that the container has no healthcheck configuration (via ContainerInspect.Config.Healthcheck) and report an error

@ndeloof ndeloof (Contributor, Author) commented Oct 22, 2021

we actually already return an error when the service/docker image doesn't define a healthcheck
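
For illustration, a minimal sketch of that detection against the Docker Engine API (not the PR's actual code; the helper and container name below are hypothetical):

package main

import (
    "context"
    "fmt"

    "github.com/docker/docker/client"
)

// hasHealthcheck reports whether a container defines a healthcheck,
// either via the compose file or baked into the image.
func hasHealthcheck(ctx context.Context, cli *client.Client, containerID string) (bool, error) {
    inspect, err := cli.ContainerInspect(ctx, containerID)
    if err != nil {
        return false, err
    }
    hc := inspect.Config.Healthcheck
    // nil means no healthcheck at all; a test of ["NONE"] means it was explicitly disabled.
    if hc == nil || (len(hc.Test) > 0 && hc.Test[0] == "NONE") {
        return false, nil
    }
    return true, nil
}

func main() {
    cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
    if err != nil {
        panic(err)
    }
    ok, err := hasHealthcheck(context.Background(), cli, "my_container") // hypothetical container name
    fmt.Println(ok, err)
}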

@ndeloof ndeloof (Contributor, Author) commented Oct 26, 2021

@ulyssessouza any thoughts?

@thaJeztah (Member)

I don't think we should assume healthy == running

The point is that the general design for healthchecks was to (during startup) be a readiness check; in that design "no healthcheck" implicitly means "no readiness check", so running == "healthy"

better detect container has no healthcheck configuration (by ContainerInspect.Config.Healthcheck) and report an error

So, what are the other options?

  • started is the default
  • healthy will error if the service doesn't have a healthcheck
  • completed_successfully ??? not sure what this does; wait for the container to exit?

I'm personally still in favor of making this part of the standard behavior; we're already printing the steps taken to bring up the stack, so it would be one more output in that list

  • creating network
  • creating volumes
  • start services
  • wait for services to be running
  • (*) waiting for services to be healthy

The UX would be more logical for docker compose up to stay attached until the project is deployed (similar to docker service create). After all, the project won't be functional until it's healthy?

@ndeloof ndeloof (Contributor, Author) commented Oct 27, 2021

The point is that the general design for healthchecks was to (during startup) be a readiness check; in that design "no healthcheck" implicitly means "no readiness check", so running == "healthy"

There's indeed no distinction in docker between "ready" and "healthy", but I think we can assume actual usages based on this. Better reject a --wait when no healthcheck is set than pretending to be clever.

So, what are the other options?
started is the default

Then this flag is not needed; compose up -d already exits after all containers have been started

healthy will error if the service doesn't have a healthcheck

this is what's implemented here

completed_successfully ??? not sure what this does; wait for the container to exit?

yep. Basically, an init container.
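
(For context, the compose spec already models this for dependencies via the service_completed_successfully condition; a small sketch, with made-up service and image names:

services:
  init-db:
    image: postgres:alpine
    command: ./init-db.sh
  app:
    image: myapp
    depends_on:
      init-db:
        condition: service_completed_successfully
)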

I'm personally still in favor of making this part of the standard behavior

This is all about backward compatibility; we can't just change the behavior of a command used by thousands (millions?) because it looks nicer.

@ndeloof ndeloof force-pushed the up_wait branch 2 times, most recently from 4bc573f to 5df5f77 on October 27, 2021 07:54
@thaJeztah (Member)

we can assume actual usages based on this.

I'm not sure what you mean by "assume actual usages"

than pretending to be clever.
..
yep. Basically, an init container.

For both cases, the --wait would be for the service to reach/reconcile with its expected state, which could be "running" (no healthcheck), "healthy" (with healthcheck), or "exited successfully" (init containers)

This is all about backward compatibility; we can't just change the behavior of a command used by thousands (millions?) because it looks nicer.

I think we're waaaay past "backward compatibility" when we did #8655. We could still use the --compatibility flag for that if it's a real concern.

because it looks nicer

It's not about "looking nicer"; I think the current behavior could be considered a bug; the expectation of a compose up would be to deploy the project (as described in the compose file) and detach once it successfully did so. Not waiting for that means the project deployment didn't finish.

@ndeloof ndeloof (Contributor, Author) commented Oct 29, 2021

#8655 changed container names, but it doesn't prevent most users from running compose up and keeping the same commands they're used to.
Changing the behavior of up to wait for healthcheck status would make many users suddenly see up get stuck for minutes without any extra logs. I can't imagine the number of issues we would receive about this.

@ndeloof ndeloof (Contributor, Author) commented Oct 30, 2021

It seems we can hardly find a consensus here.

Here is my last proposal, reducing the scope:
Introduce a --wait flag to wait for all services to reach either the running state, or healthy for those that define a healthcheck test (either via the compose file or the image)

@ndeloof ndeloof (Contributor, Author) commented Nov 3, 2021

Implemented the proposed "simpler" --wait boolean flag. Please reconsider.

@ndeloof ndeloof force-pushed the up_wait branch 2 times, most recently from 09976f7 to 61d63cd on November 3, 2021 08:25
Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
@thaJeztah (Member)

Here is my last proposal, reducing the scope:
Introduce a --wait flag to wait for all services to reach either the running state, or healthy for those that define a healthcheck test (either via the compose file or the image)

Looks like I missed your comment. Yes, I think that's a good middle-ground for now.

@thaJeztah thaJeztah (Member) left a comment

LGTM

dep, config := dep, config
eg.Go(func() error {
    ticker := time.NewTicker(500 * time.Millisecond)
    defer ticker.Stop()
    for {
        <-ticker.C
        switch config.Condition {
        case ServiceConditionRuningOrHealthy:
            healthy, err := s.isServiceHealthy(ctx, project, dep, true)
Member

(just thinking out loud); perhaps we need to have a look at the events API, and see if we can improve it enough so that compose can (more easily) use that. We already have a health_status event, but perhaps that alone is not sufficient, but we could look at enhancing it. (that way compose wouldn't have to poll docker inspect calls)

Contributor Author

would be a nice addition to the event stream indeed
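
For illustration, a rough sketch of what an event-driven watcher could look like with today's API (an assumption, not compose's implementation): subscribe to container events and match the health_status action client-side, instead of polling inspect:

package main

import (
    "context"
    "fmt"
    "strings"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/api/types/filters"
    "github.com/docker/docker/client"
)

func main() {
    cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
    if err != nil {
        panic(err)
    }
    // Subscribe to container events only; health events arrive with an
    // action of "health_status: healthy" or "health_status: unhealthy".
    f := filters.NewArgs(filters.Arg("type", "container"))
    msgs, errs := cli.Events(context.Background(), types.EventsOptions{Filters: f})
    for {
        select {
        case m := <-msgs:
            if strings.HasPrefix(string(m.Action), "health_status") {
                fmt.Printf("%s -> %s\n", m.Actor.Attributes["name"], m.Action)
            }
        case err := <-errs:
            panic(err)
        }
    }
}

Matching the action prefix client-side sidesteps the question of whether the daemon's event filter understands the "health_status: ..." form.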

@ndeloof ndeloof merged commit 72e4519 into docker:v2 Nov 3, 2021
@ndeloof ndeloof deleted the up_wait branch November 3, 2021 17:22
@mitasov-ra

Sorry, but is there any timeout for waiting?

I've set my health check to "exit 1" with 5 retries every 1 second, but

docker compose up -d --wait

has been waiting for several minutes already, and it seems like it will go on forever.

I thought it should fail if any of the services is unhealthy, shouldn't it?

@benesch benesch mentioned this pull request Jan 7, 2022
benesch added a commit to benesch/compose that referenced this pull request Jan 7, 2022
The previous code would wait for dependencies to become healthy forever,
even if they'd become unhealthy in the meantime. I can't find an issue
report for this bug, but it was described in a comment on the PR that
introduced the `--wait` flag [0].

[0]: docker#8777 (comment)
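
A simplified sketch of the behavior this commit describes (the helper shape is an assumption, not compose's actual code): poll inspect, return once healthy, and fail fast on unhealthy rather than ticking forever:

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/client"
)

// waitHealthy polls a container until it reports healthy, and errors out as
// soon as it reports unhealthy instead of waiting forever.
func waitHealthy(ctx context.Context, cli *client.Client, containerID string) error {
    ticker := time.NewTicker(500 * time.Millisecond)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-ticker.C:
            inspect, err := cli.ContainerInspect(ctx, containerID)
            if err != nil {
                return err
            }
            if inspect.State == nil || inspect.State.Health == nil {
                return fmt.Errorf("container %s has no healthcheck", containerID)
            }
            switch inspect.State.Health.Status {
            case types.Healthy:
                return nil
            case types.Unhealthy:
                return fmt.Errorf("container %s is unhealthy", containerID) // fail fast
            }
            // Status "starting": keep polling.
        }
    }
}

func main() {
    cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
    if err != nil {
        panic(err)
    }
    fmt.Println(waitHealthy(context.Background(), cli, "my_container")) // hypothetical name
}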
ndeloof pushed a commit that referenced this pull request Jan 7, 2022
ulyssessouza pushed a commit that referenced this pull request Mar 28, 2022
jpmckinney added a commit to open-contracting/spoonbill-web that referenced this pull request Apr 5, 2022
…r.yml.

Checks
- Add Django checks
- Add PYTHONWARNINGS check
- Remove diff-cover check

Move build step to Docker workflow
- Remove permissions section
- Remove CI=True POSTGRES_HOST=127.0.0.1 DOMAIN_URL=http://127.0.0.1/api environment variables
- Use --wait instead of sleep 60 docker/compose#8777
- Use default values for API_PREFIX, POSTGRES_DB, POSTGRES_USER, JOB_FILES_TIMEOUT
- Move if-statement for SKIP_TEST

Refactor test workflow
- Move Transifex upload to new job
- Remove coverage thresholds (defer to coveralls)
- Remove CI=True DEBUG=True environment variables
- Remove fetch-depth: 0
- Use default values for API_PREFIX, POSTGRES_DB, POSTGRES_USER
- Change service ports
- Change PostgreSQL credentials

Remove pytest configuration from setup.cfg
- Add --cov spoonbill_web
- Allow auto-discovery of DJANGO_SETTINGS_MODULE
- Use default for python_files (test_*.py) and testpaths (all)
- Remove norecursedirs = .git

Other changes
- Delete .envrc
- Remove ALLOWED_HOSTS from docker-compose.test.yaml
- Change settings defaults to require fewer overrides:

  - API_PREFIX
  - CELERY_BACKEND
  - CELERY_BROKER
  - DB_HOST
  - POSTGRES_DB
  - POSTGRES_PASSWORD
  - POSTGRES_USER
debdutdeb pushed a commit to debdutdeb/compose that referenced this pull request Jun 30, 2022
@noorul noorul commented Aug 4, 2022

How long does this wait?

@ndeloof ndeloof (Contributor, Author) commented Aug 4, 2022

@noorul it waits until the dependent services report a healthy state, as configured by HEALTHCHECK

@noorul noorul commented Aug 4, 2022

@ndeloof I think a timeout option would add value here instead of waiting forever.
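
(Until a built-in timeout exists, one workaround is to bound the wait from the shell, assuming GNU coreutils timeout is available:

$ timeout 60 docker compose up -d --wait

timeout exits with status 124 if compose is still waiting when the limit expires, so scripts fail instead of hanging.)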

robertknight added a commit to hypothesis/lms that referenced this pull request Jul 10, 2023
This enables starting lms services and the dev server by running `make services
&& make dev`, without encountering errors due to the DB not being ready.

nb. The `--wait` flag is missing from the docs but see
docker/compose#8777.
robertknight added a commit to hypothesis/cookiecutters that referenced this pull request Jul 10, 2023
This enables starting apps with `make services dev` without potentially running
into errors from the app due to services not being ready when the web server
tries to connect.

Services must define a healthcheck [1] in `docker-compose.yml` for this to work.

The `--wait` flag is missing from the Docker Compose docs, but see
docker/compose#8777.

[1] https://docs.docker.com/compose/compose-file/05-services/#healthcheck
@nawa nawa commented Aug 4, 2023

@ndeloof Thanks for the useful feature.

Do you have a proposal for how to avoid the issue with init containers that exit with status 0 as a result, which is totally OK, but --wait expects all containers described in the config to be in the running state?

services:
    db:
        image: mysql:8.0
    
    init-db:
        image: mysql:8.0
        command: ./init-db.sh
        environment:
            <<: *cenv
        volumes:
            - ./init-db.sh:./init-db.sh
        depends_on:
            db:
                condition: service_started
docker compose up -d --wait
