fleet can start units out of order on startup #997
Comments
Is there a halfway decent workaround for this? This has been biting us through pretty much every CoreOS update cycle. I've tried playing with systemd and fleet dependencies but there's always at least one service that gets lost on a reboot cycle. |
I'm having this issue as well on initial provisioning of my test Vagrant cluster. Requirements don't get fulfilled because of "No such file or directory", leaving the service that requires them in an inactive state. |
@balboah |
@bcwaldon Thanks for the hint, that makes a usable workaround for my setup. I hope fleet will be production ready by the time I'm migrating my Vagrant setup :) |
@balboah that's not a workaround, that's how it's supposed to work =) |
@bluedragonx If To me it seems confusing when |
If you start a set of units, fleet will submit, load, and start them together as necessary. The bug associated with this issue does not manifest itself in these circumstances. You do not need to load and then start if you've written your unit files correctly. |
related - #993 |
Additionally, I've seen this happen after fleet reschedules failed units to different machines in the cluster, which basically means it can randomly stop working after a successful deploy. |
👍 for a fix for this. Just a note: if a service is using "Wants" directive and if we follow the steps described in : |
Upon auto-updating from 522.6.0 to 557.2.0 we're hit by the same problem as described by @sukrit007. We're using Wants too and the only work-around seems to be starting the sidekick unit before the worker unit. Since sidekick units are binding services, this is pretty bad. |
We're getting hit with this too. Our current workaround is to pepper our scripts with calls to |
Interesting that 557.2.0 just got moved to stable with this being reported.
|
I'm getting this issue too. I experience it every morning when I open my laptop and the database running in my Vagrant cluster has died. The only way to get it started is fleetctl destroy / fleetctl start. If it helps, the database has dependencies like this (I've omitted all the other elements of the unit for clarity)
The two sidekick units are similar, and have dependencies like this
|
#944 - our hero? Only one question: how to specify |
@bcwaldon Thanks for fixing - this is awesome. Do you know if there will be a patch into the stable branch? |
Please see #1134 |
Just ran into this issue again with 607.0, fleet 0.9.1. The probability of occurrence has dropped since the fix, but it still happened once after 10 deploys.
Note: This did not happen on startup, but while deploying the services using the sidekick pattern with the "Wants" directive as described in:
|
@sukrit007 I'm surprised you're running into it more often, to be honest. Your master unit has a dependency on your sidekick unit, but fleet will not schedule your sidekick unit out to the cluster until the master has already been scheduled. Remove the Wants= directive from your master and the problem will be resolved. The reason you see it so infrequently now is likely due to the fact that you start the units at the same time and the fleet engine is slow enough that both units are available for reconciliation (and therefore scheduling) together. |
@bcwaldon I think I would still need Wants in order to ensure that the sidekick restarts when my main unit restarts. Even though I have Restart=always, my sidekick won't restart itself (when the main unit restarts) if I do not have "Wants". Ideally, I would have used "Requires" instead of "Wants", but due to a bug in systemd (#1089) I cannot use that either. I will keep monitoring this, but any suggestion for working around the issue is appreciated. |
@sukrit007 I see two short-term paths forward for you:
|
@bcwaldon Thanks for that. I will move to option 2 if I start seeing this more frequently, but I would love to stick with the sidekick pattern and its inter-unit dependency in future, to isolate the concerns in two separate unit files. |
Same here. I thought that "Cannot add dependency job for unit discovery ..." was resolved in fleet 0.9.1, but it happens to me every time I launch the sidekick. |
@umiller if you cannot use either of the options above, you can roll back the version of fleet by placing a fleetd binary at /opt/bin/fleetd and overriding the builtin fleet unit by copying /usr/lib64/systemd/system/fleet.service to /etc/systemd/system/fleet.service and modifying the ExecStart line to point to /opt/bin/fleetd. |
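The rollback steps above could be scripted roughly as follows. This is only a sketch: the function name override_fleet_unit and its optional root-prefix argument are hypothetical (the prefix exists so the steps can be tried against a scratch directory), and it assumes the unit has a single ExecStart= line and that a fleetd binary has already been placed at /opt/bin/fleetd.

```shell
#!/bin/sh
# Sketch only: override the built-in fleet unit so it runs /opt/bin/fleetd.
# override_fleet_unit and its ROOT argument are hypothetical helpers, not part of fleet.
override_fleet_unit() {
    root="${1:-}"   # empty on a real host; a scratch directory for a dry run
    src="$root/usr/lib64/systemd/system/fleet.service"
    dst="$root/etc/systemd/system/fleet.service"

    mkdir -p "$(dirname "$dst")" || return 1

    # Copy the built-in unit to the override location, rewriting ExecStart
    # on the way. Assumes exactly one ExecStart= line in the unit.
    sed 's|^ExecStart=.*|ExecStart=/opt/bin/fleetd|' "$src" > "$dst"
}
```

On a real host this would be called with no argument as root, followed by systemctl daemon-reload and a restart of fleet so that systemd picks up the overriding unit.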
fleetctl start the following units
systemctl kill -s SIGKILL fleet && systemctl start fleet
There is a chance that fleet will load and attempt to start bar.service before loading foo.service, causing the following error:
This bug was originally reported in #974.