/etc/hosts not consistently populated when using --x-networking #2445

Closed
kenjones-cisco opened this issue Nov 23, 2015 · 9 comments

Comments

@kenjones-cisco

docker-compose version: 1.5.1
docker-py version: 1.5.0
CPython version: 2.7.6
OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014

OS:

Linux entcon-dev 3.13.0-65-generic #105-Ubuntu SMP Mon Sep 21 18:50:58 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

docker-compose.yml

consul1:
  container_name: consul1
  build: .
  command: "agent -join consul3 -config-dir=/config"
  ports:
    - "8301"
    - "8302"
    - "8400"
    - "8500"
    - "8600"

consul2:
  container_name: consul2
  build: .
  command: "agent -join consul3 -config-dir=/config"
  ports:
    - "8301"
    - "8302"
    - "8400"
    - "8500"
    - "8600"

consul3:
  container_name: consul3
  build: .
  command: "agent -bootstrap-expect 3 -config-dir=/config"
  ports:
    - "8301"
    - "8302"
    - "8400"
    - "8500"
    - "8600"

When using the command:

docker-compose --x-networking up -d

On Docker Engine 1.9.0, all three containers start successfully and connect to each other every time.

On Docker Engine 1.9.1, only consul3 starts successfully; consul1 and consul2 both fail because consul3 is unknown. Using docker inspect to locate each container's hosts file shows that the file is missing entries for the other containers.
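
For anyone reproducing this, here is a minimal sketch of that check, assuming a POSIX shell with awk on the Docker host. The helper name hosts_has_entry is hypothetical and not part of docker or compose; it only tests whether a hosts-format file lists a given name.

```shell
#!/bin/sh
# hosts_has_entry FILE NAME  (hypothetical helper, for illustration only)
# Succeeds if NAME appears as a hostname (any field after the address)
# on a non-comment portion of a hosts-format FILE.
hosts_has_entry() {
    awk -v name="$2" '
        { sub(/#.*/, "") }   # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

One possible use is hosts_has_entry "$(docker inspect --format '{{.HostsPath}}' consul1)" consul3, which reads the hosts file path from the inspect output. Reading that path usually requires root on the host.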

I started with progrium/consul before switching to my own images, but the same behavior can be seen with that image if you pass all the configuration as part of the command.

@dnephin

dnephin commented Nov 23, 2015

This sounds like it might be a docker issue? The docker engine is what is responsible for creating the /etc/hosts entries. If you're able to reproduce the regression, please open an issue on docker/docker.

@kenjones-cisco
Author

Thanks for confirming that populating /etc/hosts is delegated to docker. I'll open the issue there.

@mavenugo

@dnephin @kenjones-cisco I have closed the docker issue with the following comment: moby/moby#18174 (comment).
Let's try to figure out the compose logic for how the containers are sequenced (which impacts the /etc/hosts update).

@dnephin

dnephin commented Nov 24, 2015

I think I see what's happening now. The container's entrypoint assumes that the entry already exists in /etc/hosts, but the order in which the entries are written is undefined, so that might not be true. If the container stayed running, the /etc/hosts entry would eventually be written. I'm not sure why this worked with docker 1.9; it might have just been a coincidence.

I think the entrypoint script needs to poll and wait for the consul3 instance to be available, either by retrying the consul command or maybe by using a ping as a healthcheck to see when it's resolvable.
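
A minimal sketch of that polling approach, assuming a POSIX shell entrypoint. The retry helper is hypothetical (it is not part of consul or compose), and the commented usage assumes getent is available in the image.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND until it succeeds, sleeping DELAY seconds between tries.
# Returns 0 on the first success, or 1 once ATTEMPTS tries have failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@" >/dev/null 2>&1; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Possible use in an entrypoint (assumption: the image ships getent):
#   retry 30 1 getent hosts consul3 || exit 1
#   exec consul agent -join consul3 -config-dir=/config
```

Waiting on name resolution rather than on a fixed sleep keeps startup fast when the entry is already present, and it fails loudly if the entry never appears.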

There has been some discussion in #686 about maybe adding support for some ordering with --x-networking; however, it's not clear whether that would really address the issue.

@kenjones-cisco
Author

I originally had consul1 as the intended leader, but compose kept starting the containers in reverse order (consul3, consul2, consul1), so I made consul3 the expected leader. On Docker Engine v1.9.0 this works every time without fail, but on Docker Engine v1.9.1 consul1 and consul2 die with the error described above.

Interestingly, with consul3 already running, running docker-compose --x-networking up -d a second time to start consul1 and consul2 still fails. So writing /etc/hosts and executing the entrypoint/command must be happening either in parallel or out of sequence.

As for ordering, if nothing else, a priority key could identify which services should be started first. The links option probably enforced a natural ordering, but without it each container needs its own retry logic. A priority key would instead let me indicate that containerX has a higher priority than containerY, so that the containers are sorted by priority at startup.

@ulope

ulope commented Nov 26, 2015

I've had similar issues with the order in which services get started (even without --x-networking) and am using this script in my entrypoints as a workaround for now.

@jjfraney

In my case, host names in /etc/hosts show as: container-name and container-name.project-name.

[jfraney@openldap-testvm acrapp]$ sudo docker exec acr10_acrRest01_1 cat /etc/hosts
stuff removed
172.21.0.3 acr10_acrdbHost_1
172.21.0.3 acr10_acrdbHost_1.acr10

I want this entry, with simply the service name or the service's hostname:
172.21.0.3 acrdbHost

When I start containers individually with docker, without compose, I have control over this entry. Why not with compose?

@dnephin

dnephin commented Nov 30, 2015

With docker you can only set it to the container name, which is what is happening here as well. See #2312 and moby/libnetwork#737 for the work in progress to add alias support.

@dnephin

dnephin commented Jan 17, 2016

This sounds like a duplicate of #2614 and #2681 which I've closed. This issue already covers the suggested approach for handling the problem.

The underlying issue is tracked in #374 and #2682, so I'm going to close this one.
