Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow dependencies to be specified explicitly with depends_on #686

Closed
wants to merge 17 commits into from
Closed

Allow dependencies to be specified explicitly with depends_on #686

wants to merge 17 commits into from

Conversation

gilclark
Copy link

@gilclark gilclark commented Dec 1, 2014

Currently services are started in the order defined by links between services. We also need services that will share volumes to be started in the right order. And the net: container:foo statement should be recognized as a project service to be started in the right order and with the proper container name being passed to the using services. Finally, dependencies need to be explicitly called out even if the services don't use links, volumes_from, net, etc. from each other so that if ordering of service start up is needed it happens correctly.

For example, with this fig.yml file, we will be guaranteed that the data container starts before the db container which starts before the app container which starts before the web container. And the network container will start before all the rest and it's container id will be properly passed to the using containers. This allows the "kubernetes pod" type system to be set up easily which was hard to set up with fig before.

web:
  image: webimage
  net: "container:network"
  depends_on:
    - app
app:
  image: appimage
  net: "container:network"
  depends_on:
    - db
db:
  image: dbimage
  volumes_from:
    - data
  ports:
    - 1529:1529
data:
  image: dataimage
  volumes: /foo

The internals of dependency management have been modified quite a bit with most changes to service.py and project.py. The unit and integration tests have been modified to check that it all works.

def links(self):
links = []
def dependencies(self):
dependencies = []
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code is only used in tests, if you're changing it, I'd love to see it removed from here and moved into a test module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I don't see it used anywhere. I removed it and re-ran the tests.

@dnephin
Copy link

dnephin commented Dec 2, 2014

I noticed some commented out test code and a version change, so I assume this is not ready yet, is that correct?

@dnephin
Copy link

dnephin commented Dec 2, 2014

Thanks for the contribution!

This is a big change, I think it might make sense to split this up into two PRs. One for the change to allow for net: "container:network" and another for the new depends_on key.

for service_dict in sort_service_dicts(service_dicts):
links = project.get_links(service_dict)
volumes_from = project.get_volumes_from(service_dict)
temp_dicts = copy.deepcopy(service_dicts)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never deepcopy() there's always a better way. I think this can be done without mutating the original dicts so we don't need a copy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the deep copy, the goal is to not allow a service to be created with volumes, links, etc. passed to it because it won't understand the dependencies - only the project knows that. It would be easy to allow those keys to be passed to the Service instantiation where they would be ignored and let the project do it's thing later. That would avoid needing the temp dictionary. But I was trying to be more prescriptive with the service and not allow those dependency related keys to be passed to it which necessitated removing them from the dictionary. I'm not a python expert so any guidance would be appreciated and I can update my fork.

Regarding separating the net container and depends_on functionality - it can be done but they both rely on the dependency management refactoring. So removing one or the other doesn't buy that much. It's all one big related change.

The build dependency issues which you referred to are fixed by adding the "depends_on" key. If two builds have to happen in order the "depends_on" key will get the job done with a fig up. Fig build won't recognize the dependencies at this point although that could be added using the dependency ordering.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's just a lot of code changes here, so it's going to be hard to get it all reviewed and actually merged.

I think the change to the sort services looks good, but the "depends_on" is a more controversial change, and might take longer to get merged.

I'm still looking it over, so maybe there's a better way to break it up.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. It is definitely a lot of change. Of of my goals was not to change the api (syntax or semantics) at all and I think that's the case except for the "depends_on" which is new. It could be called whatever you like. The goal of that is to force the ordering if it's not otherwise forced. It does work including the case where one build needs to be used by another in order. I added a bunch of test cases to the sort... unit tests to test the ordering.

@dnephin
Copy link

dnephin commented Dec 2, 2014

Issues #295 and #235 are related

volumes_from=[mock.Mock(id=container_id, spec=Container)])

self.assertEqual(service._get_volumes_from(), [container_id])

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests really shouldn't be deleted, the code still exists.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since a Service can't be created with volumes anymore (only the Project knows the dependencies of a Service and those are added to the Service by the Project during dependency resolution - including existing containers) I moved those tests to the project unit tests. I did add a variant for the service already having the container and I'll push that today.

@STRML
Copy link

STRML commented Dec 6, 2014

I am a big fan of this, 👍 from me. I was just having this problem with postgres - if using, for example, fig up -d db, the data container is created, but because it is not a proper dependency, the volume links are not created which can lead to a very nasty surprise if you kill the db container.

For example, with this configuration:

db:
  ports:
   - "5432:5432"
  volumes_from: 
   - dbdata
dbdata:
  command: echo "Data-only container for postgres"
  image: postgres
  volumes:
    - /etc/postgresql
    - /var/log/postgresql 
    - /var/lib/postgresql
    - /var/lib/postgresql/data

This works properly with a simple fig up - I see the data container created automatically before the dependent container. But if it's part of a larger configuration, and I fig up db, the data container is not created at the correct time, but it is still created! This means that everything looks fine in fig ps, but the volumes are not actually attached.

You can get around this by adding a link from db to dbdata, but then initialization sometimes fails. I got around that by adding a sleep to the command for dbdata... this feels hacky.

@dnephin
Copy link

dnephin commented Dec 6, 2014

@STRML I thought I had fixed that issue with #456. fig already considers volumes_from for dependency ordering https://github.com/docker/fig/blob/master/fig/project.py#L30, so I think the issue you're having is unrelated to this.

@STRML
Copy link

STRML commented Dec 6, 2014

@dnephin You're right, turns out my homebrew had screwed the pooch and linked to fig 1.0.0 where this was still broken. So sorry to derail the conversation.

@gilclark
Copy link
Author

gilclark commented Dec 6, 2014

I have reworked my fork so that the changes are simpler and in line with @dnephin suggestions. It's more of an addition than a deep change.

The depends_on is still there but much simpler as it only affects the topo ordering. volumes_from and net container dependencies work too. I did remove the code from _get_volumes_from that creates net containers if they con't exist because those service are now guaranteed to be already running because of the ordering (the containers will exist). There are many new integration tests and I was able to leave all of the service unit tests intact except one (the one that checks for _get_volumes_from creating containers) with the simpler set of changes.

@dnephin, please check out the new tip of my fork and see what you think.

Thanks

dep_names = service.get_linked_names() + \
service.get_volumes_from_names() + \
service.get_net_names() + \
service.get_depends_on_names()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably fail flake8 checks. pep8 suggests using parenthesis around the statements instead of \.

ex:

dep_names = (service.get_linked_names() +
    service.get_volumes_from_names() +
    ...
)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake8 was fine actually but I changed it to use parens.

@dnephin
Copy link

dnephin commented Dec 6, 2014

This looks great! Definitely on the right track

@@ -1,4 +1,4 @@
from __future__ import unicode_literals
from .service import Service # noqa:flake8

__version__ = '1.0.1'
__version__ = '1.0.1-wdeps'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version will get bumped when a release happens, please revert this version bump.

@dnephin
Copy link

dnephin commented Dec 10, 2014

@gilclark cool, just a couple small things I noticed after another read.

You'll still need to squash these git commits into a few logical ones. Since this is such a large change, I think 2 or 3 commits are probably warranted. You can do that pretty easily by doing:

git reset <sha of the last commit before yours>
git add -p # lets you pick chunks one at a time

Let's loop in the other maintainers and see what they think

@dnephin
Copy link

dnephin commented Dec 10, 2014

cc @bfirsh , @aanand

Hey guys, this is a rather large change, but I think there is some good stuff in here. To summarize the changes:

  1. Support net:<service name> and net:<container_name> (I believe this would resolve Circular import between container1 and container2 #666, and generally would be a nice thing to have when trying to create a fig setup for something like zookeeper)
  2. Adds a new config key depends_on to force an arbitrary ordering of service (this has been requested a few times, and would fix related issues: Order of containers starting up? #235, request: add build priority/order #295)
  3. Consider both of the above for the purposes of sorting services
  4. Start all dependencies on fig run and fig up (instead of just links). All dependencies are now: links, volumes_from, net, and depends_on

I'm pretty happy with (1) and (3).

(2) would satisfy a few requests, but is a new fig-ism, so I'm not sure if that's something you want to introduce right now.

(4) seems like a good idea. It addresses the problem of not having a container available when you try and do a volumes_from or net. However I think it may not be backwards compatible. Anyone using volumes_from right now would have their volume container start if they fig run or fig up with a service name. This seems like a pretty minor change that shouldn't break anything, but who knows.

@gilclark
Copy link
Author

Ok, I've cleaned up the history.

Regarding depends_on this is something my particular use case calls for. I have a web container that must start after the app container but there are otherwise no dependencies so depends_on works well.

The code for depends_on is the simplest of all so in that regard the change is not as much as the rest. The main change for users is that new "fig-ism".

Gil Clark and others added 2 commits December 13, 2014 10:34
service ordering.  Added depends_on to allow arbitrary dependency
specification.

Signed-off-by: Gil Clark <gilclark1@gmail.com>
Signed-off-by: Gil Clark <gilclark1@gmail.com>
@gilclark
Copy link
Author

@dnephin,

Hi,
What are the next steps? I'm looking forward to using the new features in some work I'm doing for my company.

Thanks

@dnephin
Copy link

dnephin commented Dec 16, 2014

@gilclark we need another "lgtm" from another maintainer.

cc @bfirsh and @aanand for comment

@gilclark
Copy link
Author

Ok, thanks.

@aanand
Copy link

aanand commented Dec 17, 2014

Nice.

Regarding (2): I'm skeptical of introducing any configuration option that isn't a Docker idiom, so 👎 on depends_on unless a really compelling use case can be described for it.

Regarding (4): Agreed. It definitely feels like a bug that links is respected when pulling in dependencies but volume_from isn't. Seems unlikely that changing that will break anything.

@dnephin
Copy link

dnephin commented May 6, 2015

I'm not entirely sure what's going on from that gist, but I don't think you can necessarily rely on those IPs being consistent. You would need to use some other form of service discovery like one of the many docker dns options which would let you retry until the services start.

@stephanbuys
Copy link

@dnephin it needs a little bit of Weave background, the meat of the matter is in the test shell script. Basically it creates a small (weave-based) network between the two containers and statically assigns two IP addresses to the containers involved. But the issue (and why I'm commenting on this thread) is that you can only run the "attach" command (which adds the static IPs to the containers) after the said containers are started. "depends_on" would solve that.

That said, service discovery is a valid strategy, and building more smarts into the weave "attach" command is also an option, so valid point.

@pikeas
Copy link

pikeas commented Jun 18, 2015

Bump, given that depends_on was popped off to a different issue, why hasn't this been merged?

@gilclark
Copy link
Author

depends_on doesn't match one of the current docker commands so the use of it has been rejected by the compose curators at this time. I still have the code and unit tests to submit to a pull request but only if I know there's a good chance it will get accepted.

@pikeas
Copy link

pikeas commented Jun 18, 2015

@gilclark What about the part of the PR that treats volumes_from as a dependency?

@gilclark
Copy link
Author

Yes, that was merged in along with the net container being treated as a dependency.

@gtmtech
Copy link

gtmtech commented Jul 30, 2015

+1 for depends_on.

links don't work for --net=host (see moby/moby#7066) , and yes docker doesn't have depends_on as a core command, (maybe it should), but then docker doesn't manage multiple containers by itself (swarm aside), and the whole point of docker-compose is to manage multiple containers, so IMHO it would be better to include this rather fundamental feature - (especially if it got coded up over 6 months ago) and port it over to use docker "depends_on" if it ever gets provided there at that time. It seems to be a vital missing piece of functionality.

@aanand
Copy link

aanand commented Jul 31, 2015

@gtmtech The place to discuss this is #374 - there's a lot of discussion in there. The fundamental problem is more complex than starting containers in a particular order.

@dnephin
Copy link

dnephin commented Sep 4, 2015

Part of this was pushed in #1076 which doesn't seem to be linked from here for some reason, so adding the link now.

@dnephin dnephin changed the title Allow volumes_from, network using containers (pre-existing or in fig project), and links to be treated as dependencies. Also, allow dependencies to be specified explicitly. Allow dependencies to be specified explicitly with depends_on Sep 14, 2015
@dnephin
Copy link

dnephin commented Sep 14, 2015

With the new networking coming in (and deprecating links), it might be worth considering a depends_on option. It still seems to me like more of a "nice-to-have" than a critical feature, but it's worth some discussion I think.

@gilclark
Copy link
Author

I had a fork with depends_on a while back. It's not hard to do since it uses the dependency logic of links and volumes. Would it be better to add it in before or after links are removed.

@delfuego
Copy link

It's hard to tell which issue these days is the "right" place to vote for a new docker compose functionality that allows explicit dependencies to be declared, but consider my vote a strong one. With the new Docker 1.9 networking functionality, and looming deprecation of container links in favor of it, there is now no great way to make sure that container A starts up before container B — because if you use Docker 1.9 user-defined networking, you can no longer specify container links. That's... broken.

@mbdas
Copy link

mbdas commented Nov 18, 2015

I think issue 374 is the right place for it. Can you post it there? I was about to post this today myself.

Sent from my iPhone

On Nov 18, 2015, at 6:22 AM, Jason Levine notifications@github.com wrote:

It's hard to tell which issue these days is the "right" place to vote for a new docker compose functionality that allows explicit dependencies to be declared, but consider my vote a strong one. With the new Docker 1.9 networking functionality, and looming deprecation of container links in favor of it, there is now no great way to make sure that container A starts up before container B — because if you use Docker 1.9 user-defined networking, you can no longer specify container links. That's... broken.


Reply to this email directly or view it on GitHub.

@delfuego
Copy link

@mbdas Done.

@kevstigneev
Copy link

Deprecation of links without a replacement boils Compose value down to a basic shell script. The major power of Compose is being a "Make" for containers, so

docker-compose up <service>

assures that both the and the other services it needs are started. docker-compose.yml used to be a reasonable manifest for a composite application that is executable and is easy to translate to the other representations (e.g. Marathon). While

docker-compose --x-networking up <service>

starts just the itself, so I have either to list its dependencies in the command line or have a separate Compose file for each service.

Networking effect of links is difficult to underestimate ether as it provides static "host names" for service containers regardless scale or COMPOSE_PROJECT_NAME. But that's off-topic.

@wr0ngway
Copy link

wr0ngway commented Dec 5, 2015

+1

@dnephin
Copy link

dnephin commented Jan 17, 2016

Thanks for starting the discussion about this. I've opened #2682 so we can continue to discuss the idea.

A lot has changed since this PR was opened, and I believe we applied some of these changes to rename link to deps in a separate branch. Since there's going to be a lot of work to rebase this, and have it only apply to the v2 config, it may be easier to start from scratch.

I'm going to close this PR for now until we've sorted out the details in #2682.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet