Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(docker-compose): make startup more robust with deterministic services' dependencies #7880

Conversation

gcernier-semarchy
Copy link
Contributor

@gcernier-semarchy gcernier-semarchy commented Apr 21, 2023

Purpose

Several unstable behaviours of the startup through docker compose were noticed ; often requiring to down and up again, after a first unsuccesful up, whether it was just after containers' creation or restart of Docker / host machine.
This PR brings several improvements in corresponding docker-compose*.yml files (esp. /quickstart ones but then more extensively, trying to be consistent as much as possible), especially by:

  • using more services' healthcheck definitions (in /quickstart ones, only elasticsearch had one)
  • making dependency of service-A on service-B more robust by relying on those healthcheck・s, thanks to the depends_on's long syntax
      service-A:
        ...
        depends_on:
          service-B:
            condition: service_healthy | service_completed_successfully
    to ensure that service-A is started not only after service-B is started (which is what short syntax only ensures), but also after service-B is a:
    • either service_healthy which means that its initialization (which can takes some time) completed, it's now running fine and therefore is actually ready to serve requests.
    • or service_completed_successfully, which is particularly useful in the DataHub's context of *-setup services/containers
  • introducing many more such dependencies compared to what existed before.

Which somehow leads to such a dependency graph, e.g. with one of full "quickstart" examples:

                 datahub-frontend-react   datahub-actions
                                    \     /
                                datahub-gms (healthy)
                                       |
                                datahub-upgrade (completed)
            /--------------------/   |   \   \------------------------------------------------\
           /                         |    \-------------------\                                \
mysql-setup (completed)  elasticsearch-setup (completed)  kafka-setup (completed)  (if apply) neo4j (healthy)
    |                           |                          /         \
  mysql (healthy)         elasticsearch (healthy)   broker (healthy)  schema-registry (healthy)
                                                      |
                                                  zookeeper (healthy)

Hence the re-ordering in the docker-compose*.yml files to somehow reflect this dependencies' ordering.

This also enables to remove several restart: on-failure occurrences, that were poor workarounds compared to such more robust and deterministic solution (only kept the occurrences for mysql/mariadb that appears to sometimes still fail just once at the very beginning when trying to create a socket, but not due to docker-compose*.yml structure).

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the devops PR or Issue related to DataHub backend & deployment label Apr 21, 2023
@pedro93 pedro93 added the platform PR-s that make changes to core parts of the platform label Apr 21, 2023
@david-leifker
Copy link
Collaborator

@gcernier-semarchy - This is really nice!!! When did these features get added to docker-compose? I recall looking for a way to do this and not finding it or something along the lines of it was ignored by docker-compose cli.

@gcernier-semarchy
Copy link
Contributor Author

Yes @david-leifker, those healthchecks-based dependency mechanisms are very helpful!

I'm not sure of understanding exactly the whole story, but it seems that there were some back and forths, because:

  • this feature was supported in docker-compose 1.10.0 (beginning of 2017) with version 2.1+ of Compose files ⤵️
  • and, at the same time, it was said that « this feature will not be ported to version 3 Compose files » (in terms of syntax, then)

image

This highlights in particular the mix of 2 lifecycles involved there, which complexifies understanding:

  • the one of Compose Specification standard (cf. website, compose-spec/compose-spec repo) defining format/syntax of corresponding .yml Compose files (latest one is 3.9)
  • the one of Docker Compose (cf. docker/compose repo) (latest one is 2.17.2) considered as the Reference Implementation of the aforementioned Compose Specification

And then, it seems like:

  • the syntax was first re-introduced around May 2020 in the version 3 of Compose Specification standard after some debates in compose-spec/compose-spec's issue #68 and PR #72
  • then, around September 2020, the feature was correctly supported by docker/compose's release 1.27.0 (because supposed to support « merge of 2.x and 3.x Compose formats »)
    But there might be some period of instability of this feature usage, whether in terms of syntax in Compose files or actual support in Docker Compose implementation, depending on used versions for each of them.

In the meantime, when it was not supported, it seems like (even if it disappeared from the documentation history) Docker Compose suggested to perform such checks in application code or thanks to a 3rd-party tools like wait-for-it:
image

Anyway, the good news is that this syntax and this feature are now natively supported and since quite some time and versions of both Compose Specification and Docker Compose! 🙂

@gcernier-semarchy gcernier-semarchy force-pushed the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch from dfdf3ea to 782f8e3 Compare April 21, 2023 22:31
@gcernier-semarchy gcernier-semarchy force-pushed the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch from 782f8e3 to 7fe924b Compare April 25, 2023 14:27
@gcernier-semarchy gcernier-semarchy force-pushed the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch from 7fe924b to a24e095 Compare April 26, 2023 11:48
@david-leifker
Copy link
Collaborator

I really like this PR, the blocker that I need to address here is that the quickstart compose files are derived from other compose files in the parent directory. I need to actually reproduce the changes there so that the automated generation of the quickstart versions are in alignment. I believe it'll be next week before I can get bandwidth to make that change here.

@gcernier-semarchy
Copy link
Contributor Author

I really like this PR, the blocker that I need to address here is that the quickstart compose files are derived from other compose files in the parent directory. I need to actually reproduce the changes there so that the automated generation of the quickstart versions are in alignment. I believe it'll be next week before I can get bandwidth to make that change here.

Ah yes @david-leifker! Now that you point me to this, I get what is missing in my PR.
I might be able to update it in the coming days, so that the alignment you mention will be fulfilled. But if I'm not able before next week, it would be obviously nice if you can.

@gcernier-semarchy gcernier-semarchy force-pushed the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch from a24e095 to 4211358 Compare May 8, 2023 18:15
@gcernier-semarchy
Copy link
Contributor Author

Hello @david-leifker,

I finally took some time to update the PR by taking into account your previous remark, but also rebasing from very latest changes.

I also realized that the automated generation (of docker/quickstart/*.yml files) that you mentioned, creates YAML items in alphabetical order, which is not in line with Docker-Compose common practices/structure and therefore not ideal for people wanting to use those quickstart files as is.
But no problem, I pushed them as they are generated.

But as you might have seen, in the docker/*.yml source files used to generate (by overriding mechanism) those quickstart files, I tried to ensure an ordering that reflects the dependency graph I described previously above.

Feel free to review again, so that it can go further. Thanks in advance 🙂

@gcernier-semarchy gcernier-semarchy force-pushed the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch from 4211358 to 869ea70 Compare May 8, 2023 18:25
@david-leifker
Copy link
Collaborator

@gcernier-semarchy - Awesome!!! 🤩 Thank you so much for this! I am open to removing the alpha sorting, but not a requirement for getting this in. I am not sure why it was originally done that way, maybe the comparison logic or something. In any case, thank you for taking the time to update this!

@david-leifker david-leifker merged commit ef1ada1 into datahub-project:master May 8, 2023
30 checks passed
@gcernier-semarchy gcernier-semarchy deleted the fix/make_docker-compose_startup_more_robust_with_deterministic_services_dependencies branch May 9, 2023 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops PR or Issue related to DataHub backend & deployment platform PR-s that make changes to core parts of the platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants