Docker compose using depends_on can lead to duplicate graph traversals #9014

Closed
rogerhu opened this issue Dec 8, 2021 · 5 comments · Fixed by #9878

Comments

rogerhu commented Dec 8, 2021

Description

docker-compose up fails because it tries to create a container that already exists. We are not specifying replicas (the value is left unset), but Compose appears to create the same container twice.

 kochiku_redis:
    image: redis:4.0.10
 kochiku_mysql:
    image: mysql57:percona-5.7.30-centos
    hostname: worker-10-132-64-151.ec2
 kochiku_build:
    depends_on:
    - kochiku_mysql
    - kochiku_redis

Can yield:

Container initial_worker-kochiku_mysql-1  Creating
Container initial_worker-kochiku_redis-1  Creating
Container initial_worker-kochiku_mysql-1  Created
Container initial_worker-kochiku_redis-1  Created
Container initial_worker-kochiku_build-1  Creating
Container initial_worker-kochiku_build-1  Creating
Container initial_worker-kochiku_build-1  Created
Error response from daemon: Conflict. The container name "/initial_worker-kochiku_build-1" is already in use by container "bd810aa12bd174ee0c47e0b35c75ae95ade367c89b45b45c632f1c400e3e426c". You have to remove (or rename) that container to be able to reuse that name.

Upon further investigation, the graph traversal algorithm may have a race condition. With a common parent, this can trigger multiple traversals to the parent and cause Docker creation to fail. Basically the algorithm appears to be:

  1. Create the dependency graph. Attach parent/child relationships.
  2. Find all the leaf nodes (kochiku_mysql, kochiku_redis) and start their respective containers up.
  3. Traverse up to the parents and start their containers.

The problem seems to be that kochiku_mysql and kochiku_redis both point to the same parent, and a separate Goroutine is launched to start each of them. The race appears to be between filterAdjacentByStatusFn and updateStatus: if both calls to updateStatus complete before either filterAdjacentByStatusFn check runs, the parent is visited twice. I think code needs to be added to check for already-visited parent nodes.
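To make the suggested check concrete, here is a minimal sketch of a "claim once" guard. It is not Compose's actual code or the fix that was eventually merged: the graph type, tryClaim, markStarted, visit, and startAll are hypothetical names, and sync.WaitGroup stands in for the errgroup used in the real traversal. The idea is that the "are all children started?" check and the "mark this parent as taken" step happen under one lock, so only one Goroutine can win.

```go
package main

import (
	"fmt"
	"sync"
)

// graph is a hypothetical, simplified stand-in for Compose's dependency graph.
type graph struct {
	mu       sync.Mutex
	children map[string][]string // service -> services it depends on
	parents  map[string][]string // service -> services that depend on it
	started  map[string]bool     // services whose start function has completed
	claimed  map[string]bool     // services already taken by some goroutine
}

func newGraph(dependsOn map[string][]string) *graph {
	g := &graph{
		children: dependsOn,
		parents:  map[string][]string{},
		started:  map[string]bool{},
		claimed:  map[string]bool{},
	}
	for svc, deps := range dependsOn {
		for _, d := range deps {
			g.parents[d] = append(g.parents[d], svc)
		}
	}
	return g
}

// tryClaim returns true at most once per service, and only when every
// dependency has started. Check and claim happen under the same lock.
func (g *graph) tryClaim(svc string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.claimed[svc] {
		return false
	}
	for _, d := range g.children[svc] {
		if !g.started[d] {
			return false
		}
	}
	g.claimed[svc] = true
	return true
}

func (g *graph) markStarted(svc string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.started[svc] = true
}

// visit starts svc if it can be claimed, then moves up to its parents.
func visit(g *graph, wg *sync.WaitGroup, svc string, start func(string)) {
	defer wg.Done()
	if !g.tryClaim(svc) {
		return
	}
	start(svc)
	g.markStarted(svc)
	for _, p := range g.parents[svc] {
		wg.Add(1)
		go visit(g, wg, p, start)
	}
}

// startAll begins at the leaves and returns how often each service was started.
func startAll(dependsOn map[string][]string) map[string]int {
	g := newGraph(dependsOn)
	var wg sync.WaitGroup
	var mu sync.Mutex
	counts := map[string]int{}
	start := func(svc string) {
		mu.Lock()
		counts[svc]++
		mu.Unlock()
	}
	for svc, deps := range dependsOn {
		if len(deps) == 0 { // leaf node: nothing to wait for
			wg.Add(1)
			go visit(g, &wg, svc, start)
		}
	}
	wg.Wait()
	return counts
}

func main() {
	counts := startAll(map[string][]string{
		"kochiku_redis": nil,
		"kochiku_mysql": nil,
		"kochiku_build": {"kochiku_mysql", "kochiku_redis"},
	})
	fmt.Println(counts)
}
```

Whichever child finishes last spawns a visit that sees both children started, so the parent is still reached; the claimed flag only prevents the second start.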

Steps to reproduce the issue:

What I did

  • Running this revised test in Docker compose bug for common parents #9013, I stepped through the code section between lines 93 and 102:

        for _, node := range nodes {
            // Don't start this service yet if all of its children have
            // not been started yet.
            if len(traversalConfig.filterAdjacentByStatusFn(graph, node.Service, traversalConfig.adjacentServiceStatusToSkip)) != 0 {
                continue
            }
            node := node
            eg.Go(func() error {
                err := fn(ctx, node.Service)
                if err != nil {
                    return err
                }
                graph.UpdateStatus(node.Service, traversalConfig.targetServiceStatus)
                return run(ctx, graph, eg, traversalConfig.adjacentNodesFn(node), traversalConfig, fn)
            })
        }

    I specifically made sure the Goroutine in line 102 got fired twice to force this race condition.
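The interleaving forced in the debugger can also be replayed sequentially, without any Goroutines. The sketch below is a simplified model, not Compose's code: graph, unstartedChildren, visitParents, and raceInterleaving are hypothetical names, with unstartedChildren playing the role of filterAdjacentByStatusFn. Both children update their status first, and only then does each one run the check on the shared parent.

```go
package main

import "fmt"

// graph is a hypothetical, simplified model of the dependency graph.
type graph struct {
	children map[string][]string // service -> services it depends on
	parents  map[string][]string // service -> services that depend on it
	started  map[string]bool
}

// unstartedChildren plays the role of filterAdjacentByStatusFn:
// it lists the children of svc that have not been started yet.
func (g *graph) unstartedChildren(svc string) []string {
	var out []string
	for _, c := range g.children[svc] {
		if !g.started[c] {
			out = append(out, c)
		}
	}
	return out
}

// visitParents is the check-then-act step each child performs after it
// has started: start every parent whose children are all started.
// Nothing records that a parent was already picked up by a sibling.
func (g *graph) visitParents(svc string, log *[]string) {
	for _, p := range g.parents[svc] {
		if len(g.unstartedChildren(p)) != 0 {
			continue
		}
		*log = append(*log, p) // "start" the parent
	}
}

// raceInterleaving replays the problematic ordering step by step.
func raceInterleaving() []string {
	g := &graph{
		children: map[string][]string{
			"kochiku_build": {"kochiku_mysql", "kochiku_redis"},
		},
		parents: map[string][]string{
			"kochiku_mysql": {"kochiku_build"},
			"kochiku_redis": {"kochiku_build"},
		},
		started: map[string]bool{},
	}
	// Both leaves have already been started by their own goroutines.
	log := []string{"kochiku_redis", "kochiku_mysql"}

	// Both status updates land before either parent check runs.
	g.started["kochiku_redis"] = true
	g.started["kochiku_mysql"] = true

	// Each child now recurses into the shared parent and passes the check.
	g.visitParents("kochiku_redis", &log)
	g.visitParents("kochiku_mysql", &log)
	return log
}

func main() {
	for _, svc := range raceInterleaving() {
		fmt.Println(svc)
	}
}
```

With this ordering the parent passes the check twice, so kochiku_build appears twice in the log, which corresponds to the duplicate "Creating" lines in the failing run.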

Related issue

Describe the results you received:

Every so often, I'd see:

kochiku_redis
kochiku_mysql
kochiku_build
kochiku_build

Describe the results you expected:

Each service should be visited exactly once on every run. Instead, the output only sometimes matches the GOOD case:

kochiku_redis
kochiku_mysql
kochiku_build

Additional information you deem important (e.g. issue happens only occasionally):

$ docker-compose version
Docker Compose version v2.0.1

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3-docker)
  scan: Docker Scan (Docker Inc., v0.9.0)

Additional environment details:

Reproduced locally on macOS with IntelliJ and the latest master. The problem also shows up on the latest Linux Docker versions.

@rogerhu rogerhu changed the title Docker compose with a common parent dependency can lead to duplicate graph traverals Docker compose with a common parent dependency can lead to duplicate graph traversals Dec 8, 2021
@rogerhu rogerhu changed the title Docker compose with a common parent dependency can lead to duplicate graph traversals Docker compose with multiple children can lead to duplicate graph traversals Dec 8, 2021
@rogerhu rogerhu changed the title Docker compose with multiple children can lead to duplicate graph traversals Docker compose using depends_on can lead to duplicate graph traversals Dec 8, 2021

stale bot commented Jun 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 12, 2022

tudor commented Jun 17, 2022

Can this be un-staled? :) We're seeing this too in 2.6.0.

stale bot commented Jun 17, 2022

This issue has been automatically marked as not stale anymore due to the recent activity.

@stale stale bot removed the stale label Jun 17, 2022
@peterhoneder

We also see this in the latest stable version.

Inok commented Jul 12, 2022

We also have this issue with Compose 2.6.0. In our CI pipeline, it happens quite often, in approximately 5-10% of runs. Since we have 10 configurations to run for every commit, the pipeline for almost every commit finishes with such a failure.

Is there any chance that the issue will be investigated?
@ndeloof, sorry for disturbing you, but it seems that the issue is lost in the backlog.
