Rolling updates should allow max percentage of failure #3609

Closed
kavink opened this issue Jul 21, 2013 · 2 comments
Labels
feature This issue/PR relates to a feature request.

Comments

@kavink
Contributor

kavink commented Jul 21, 2013

While doing rolling updates, it would be nice to have a configurable parameter for the maximum number of node failures allowed before the playbook stops. If something goes wrong during an upgrade and Tomcat or some other step fails to come up on many machines, we should stop the playbook rather than take down the majority of the nodes.

For example, suppose I configure this in the playbook:

# If 5% of the servers in the group have failed, stop running the playbook
 - hosts: "{{ host_group }}"
   serial: 10
   maxfail: 5%

So with 300 servers, serial: 25, and maxfail: 5%, the limit is 15 failed servers. If 15 servers fail across the first two batches combined (50 servers), for example 10 in the first and 5 in the second, the playbook should stop.
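
The arithmetic behind this request can be sketched in a few lines of Python. This is only an illustration of the proposal as worded here, not Ansible's implementation: the function name `batches_run` and its parameters are hypothetical, failures are counted cumulatively against the whole host group, and the play is assumed to stop as soon as the count reaches the percentage.

```python
def batches_run(total_hosts, serial, max_fail_pct, failures_by_batch):
    """Return how many batches run before the proposed limit stops the play.

    Assumption (from the request above, not from Ansible itself): failures
    are summed across batches and compared with max_fail_pct percent of
    the whole host group; reaching the limit is enough to stop.
    """
    num_batches = -(-total_hosts // serial)  # ceiling division
    failed = 0
    for index in range(num_batches):
        # Batches without an entry in failures_by_batch count as zero failures.
        if index < len(failures_by_batch):
            failed += failures_by_batch[index]
        # Integer comparison avoids floating-point rounding on the percentage.
        if failed * 100 >= max_fail_pct * total_hosts:
            return index + 1
    return num_batches


# 300 hosts, serial: 25, maxfail: 5% gives a limit of 15 failed hosts.
print(batches_run(300, 25, 5, [10, 5]))  # 2: stops after the second batch
print(batches_run(300, 25, 5, [10, 4]))  # 12: 14 failures, all batches run
```

With 10 failures in the first batch and 5 in the second, the cumulative count reaches 15 and the run ends after 50 of the 300 servers have been touched.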

@mpdehaan
Contributor

Yep, this is a very good idea.

For others' information: the current behavior is to stop only when all N hosts in a "serial: N" batch have failed. That keeps a completely failed batch from rolling on to update the whole system, but if only half of the batch fails, the playbook still moves on to the next rolling update batch.
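
To make that gap concrete, here is a small Python model of the behavior described in this comment. It is a sketch built from the description alone, not code taken from Ansible, and the function name `batches_attempted` is hypothetical.

```python
def batches_attempted(serial, failures_by_batch):
    """Model of the behavior described above: stop only when a whole batch fails.

    failures_by_batch[i] is the number of failed hosts in batch i, and each
    batch is assumed to contain exactly `serial` hosts.
    """
    attempted = 0
    for failed in failures_by_batch:
        attempted += 1
        if failed >= serial:
            # Every host in this batch failed, so the play halts here.
            break
    return attempted


# serial: 10 with half of every batch failing never triggers the stop.
print(batches_attempted(10, [5, 5, 5]))   # 3
# A fully failed second batch halts the play before the third batch.
print(batches_attempted(10, [5, 10, 0]))  # 2
```

In the first call 15 of 30 hosts have failed and the update still proceeds, which is the case a percentage limit is meant to catch.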

@kavink kavink closed this as completed Jul 25, 2013
@kavink kavink reopened this Jul 25, 2013
@mpdehaan
Contributor

mpdehaan commented Aug 6, 2013

We already have the pull request for this, so I can close this feature ticket.

@mpdehaan mpdehaan closed this as completed Aug 6, 2013
jimi-c pushed a commit that referenced this issue Aug 20, 2013
…ial option, So if total number of failures execeed max_fail_pct * total number of hosts, do not go to the next serial batch
robinro pushed a commit to robinro/ansible that referenced this issue Dec 9, 2016
@ansibot ansibot added feature This issue/PR relates to a feature request. and removed feature_idea labels Mar 2, 2018
@ansible ansible locked and limited conversation to collaborators Apr 24, 2019

3 participants