Failures in parallel tasks do not stop execution #457
The following test case runs `test_thing2` on both hosts even though `test_thing` fails:

```python
from fabric.api import execute, parallel, roles, run

def test():
    execute(test_thing)
    execute(test_thing2)

@roles('web_workers')
@parallel
def test_thing():
    run('./test.sh')
    run('date')

@roles('web_workers')
@parallel
def test_thing2():
    run('date')
```
I'm assuming this isn't the intended workflow, if it is or I'm making a mistake I'd be interested to know what it is.
No, that's not the intended workflow, unless I am mis-remembering the intent (paging @goosemo to aisle 457 -- he might remember more).
What should happen is that a failure in the first parallel task stops the run before the second task starts.
I'll check it out -- thanks for the report.
As it's written, the behavior you're seeing is how it works. Return codes are essentially ignored, so that if one host's execution dies, it doesn't knock over the whole task. When tasks have dependencies, I've always either kept them in a single task or had the second, subordinate task check that the requisite task did what was needed.
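The "check in the subordinate task" workaround mentioned above could look something like this. This is a pure-Python sketch with hypothetical names (`step1`, `step2`, a marker file), not real Fabric code: the first step leaves an artifact behind, and the second step refuses to run unless that artifact exists.

```python
import os
import tempfile

# Hypothetical marker path, used only for this illustration.
MARKER = os.path.join(tempfile.gettempdir(), "step1.done")

def step1():
    # ... do step 1's real work, then leave a marker the next step can verify ...
    open(MARKER, "w").close()

def step2():
    # Guard: refuse to run unless the requisite step completed on this host.
    if not os.path.exists(MARKER):
        raise RuntimeError("step1 did not complete; refusing to run step2")
    # ... do step 2's work ...
```

In a real deploy the marker would more likely be the artifact itself (an uploaded release directory, a symlink), but the shape of the guard is the same.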
Just chatted with @goosemo on IRC about this, @sdcooke. When he wrote the original implementation, his use case was of the "I'm running on a ton of hosts and don't want one or two failures to sink the entire run" variety.
However, to be consistent with Fabric's original "fail fast" philosophy, the default behavior needs to be what you and I discussed above: a parallelized task should abort the overall run when any of its hosts fails.
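The fail-fast behavior being discussed can be illustrated outside of Fabric with a short sketch (hypothetical names, not Fabric's actual implementation): run the task on every host in parallel, collect per-host exit status, and fail the whole run if any host failed, so that a subsequent `execute()` never starts.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_host(host):
    # Stand-in for executing the task on one host; "bad*" hosts fail.
    return (host, 1 if host.startswith("bad") else 0)

def execute_parallel(task, hosts):
    """Run `task` on every host, then fail the run if any host failed.

    Individual hosts still finish their work, but the caller aborts
    before the next task is scheduled.
    """
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        results = list(pool.map(task, hosts))
    failed = [host for host, code in results if code != 0]
    if failed:
        raise RuntimeError("One or more hosts failed: %s" % ", ".join(failed))
    return results
```

With this shape, `execute_parallel(run_on_host, ["web1", "badweb2"])` raises, so a second `execute_parallel` for the next task never runs.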
We may need an additional setting to allow overriding this (i.e. at a higher level than …).
Yes, I suppose it's two different use cases. In ours, if one host fails we'll end up with inconsistent code running across our servers, which we want to avoid.
In the case I posted above, test_thing2 actually ran on both hosts even though the previous task had failed. That makes it possible to end up in a situation where steps 1 and 3 of a deploy have run even though step 2 didn't.
I've patched this, and if all looks good it'll look like this:
Also, here is the branch:
referenced this issue on Aug 1, 2013
I would like to talk again about this issue (yeah, I'm …).
Here is my use case for Fabric's parallel paradigm:
```python
from fabric.api import execute, parallel, runs_once

@runs_once
def deploy():
    execute(upload)
    execute(update)

@parallel
def upload():
    # upload app artifact
    pass

@parallel(pool_size=amount_servers / 4)
def update():
    # stop and start app
    pass
```
called like the following:
> fab -R servers deploy
I would expect Fabric to stop the parallel execution of the `update` task as soon as one host fails.
That's quite problematic if the app could not start in the first batch of hosts: the remaining batches are stopped and updated anyway.
Do you have an idea on how to solve this?
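One way to get this behavior outside of Fabric (a pure-Python sketch with hypothetical names, not a built-in Fabric feature) is to drive the hosts in batches of `pool_size` yourself and stop scheduling further batches as soon as any host in the current batch fails:

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_batches(task, hosts, pool_size):
    """Run `task` over `hosts` in batches of `pool_size`.

    Stop scheduling further batches as soon as any host in a batch
    fails, so later batches keep running the old version of the app.
    """
    completed = []
    for start in range(0, len(hosts), pool_size):
        batch = hosts[start:start + pool_size]
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results = list(pool.map(task, batch))
        completed.extend(results)
        if any(code != 0 for _, code in results):
            raise RuntimeError("host(s) in batch %r failed; aborting" % batch)
    return completed
```

The in-flight batch still finishes (the failure is only detected once its results come back), but no new batch starts after a failure.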
referenced this issue on Jun 2, 2016
It turns out this PR fixes one of my problems, but not all of them.
My issue is that the instances are attached to an AWS load balancer and can be terminated at random. In testing, it seems a terminated instance does not register as a failure: Fabric just hangs in perpetuity, waiting for the now non-existent host to finish.
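One way to avoid hanging forever on a host that no longer exists is to put a hard timeout around each per-host worker. This is a pure-Python sketch of the idea (hypothetical helper, not Fabric's actual code): wait on the worker with a timeout, and if it is still running when the timeout expires, count that host as failed instead of blocking the whole run.

```python
import threading
import time

def run_with_timeout(target, timeout):
    """Run `target` in a worker thread and wait at most `timeout` seconds.

    Returns True if the worker finished in time, False if it is still
    running -- i.e. the host never answered, so we treat it as failed
    rather than waiting forever.
    """
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    return not worker.is_alive()
```

A daemon thread that never returns won't keep the process alive, which is the point: a dead host shouldn't pin the whole deploy.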
Hi @goosemo, oh, that's a nice PR. It does exactly what we need at @captaintrain! Thanks!
Here's the output without the PR:

```
$ fab -H local1,local2 deploy
[local1] Executing task 'deploy'
[local1] Executing task 'upload'
[local2] Executing task 'upload'
[localhost] local: # hello upload
[localhost] local: # hello upload
[local1] Executing task 'update'
[local2] Executing task 'update'
[localhost] local: # hello update
[localhost] local: # hello update    < I DON'T want this second one to run, as the first update fails

Fatal error: One or more hosts failed while executing task 'update'

Aborting.
```
And now the output with the PR and the `--parallel-exit-on-errors` flag:

```
$ fab --parallel-exit-on-errors -H local1,local2 deploy
[local1] Executing task 'deploy'
[local1] Executing task 'upload'
[local2] Executing task 'upload'
[localhost] local: # hello upload
[localhost] local: # hello upload
[local1] Executing task 'update'
[local2] Executing task 'update'
[localhost] local: # hello update    < \o/ It ran only once!

Fatal error: One or more hosts failed while executing task 'update'

Aborting.
```