max_failed_percentage exit 0 #4407

Closed
dominis opened this Issue Oct 9, 2013 · 5 comments


dominis commented Oct 9, 2013

max_fail_percentage always results in a non-zero exit status, even if the percentage is not exceeded.

---
- hosts: app-pool
  gather_facts: False
  user: dominis
  serial: 32
  max_fail_percentage: 50

  tasks:
    - action: ping

dominis@shuriken:~/work/temp$ ansible-playbook a.yml -f 32

PLAY [app-pool] ***************************************************************

TASK: [ping] ******************************************************************
ok: [app01.bfc.kinja-ops.com]
fatal: [app10.bfc.kinja-ops.com] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
ok: [app08.bfc.kinja-ops.com]
ok: [app02.xyz.kinja-ops.com]
ok: [app11.xyz.kinja-ops.com]
ok: [app13.xyz.kinja-ops.com]
ok: [app02.bfc.kinja-ops.com]
ok: [app12.xyz.kinja-ops.com]
ok: [app15.bfc.kinja-ops.com]
ok: [app05.xyz.kinja-ops.com]
ok: [app09.bfc.kinja-ops.com]
ok: [app16.bfc.kinja-ops.com]
ok: [app04.xyz.kinja-ops.com]
ok: [app06.xyz.kinja-ops.com]
ok: [app09.xyz.kinja-ops.com]
ok: [app11.bfc.kinja-ops.com]
ok: [app14.bfc.kinja-ops.com]
ok: [app14.xyz.kinja-ops.com]
ok: [app15.xyz.kinja-ops.com]
ok: [app16.xyz.kinja-ops.com]
ok: [app07.bfc.kinja-ops.com]
ok: [app03.xyz.kinja-ops.com]
ok: [app12.bfc.kinja-ops.com]
ok: [app13.bfc.kinja-ops.com]
ok: [app05.bfc.kinja-ops.com]
ok: [app03.bfc.kinja-ops.com]
ok: [app07.xyz.kinja-ops.com]
ok: [app10.xyz.kinja-ops.com]
ok: [app04.bfc.kinja-ops.com]
ok: [app01.xyz.kinja-ops.com]
ok: [app06.bfc.kinja-ops.com]
ok: [app08.xyz.kinja-ops.com]

PLAY RECAP ********************************************************************
           to retry, use: --limit @/Users/dominis/a.retry

app01.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app01.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app02.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app02.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app03.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app03.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app04.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app04.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app05.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app05.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app06.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app06.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app07.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app07.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app08.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app08.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app09.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app09.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app10.bfc.kinja-ops.com    : ok=0    changed=0    unreachable=1    failed=0
app10.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app11.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app11.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app12.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app12.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app13.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app13.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app14.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app14.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app15.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app15.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app16.bfc.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0
app16.xyz.kinja-ops.com    : ok=1    changed=0    unreachable=0    failed=0

dominis@shuriken:~/work/temp$ echo $?
3

I don't think this should be the same as this:

---
- hosts: app10.bfc.kinja-ops.com
  gather_facts: False
  user: dominis

  tasks:
    - action: ping

dominis@shuriken:~/work/temp$ ansible-playbook a.yml -f 32

PLAY [app10.bfc.kinja-ops.com] ************************************************

TASK: [ping] ******************************************************************
fatal: [app10.bfc.kinja-ops.com] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue

FATAL: all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
           to retry, use: --limit @/Users/dominis/a.retry

app10.bfc.kinja-ops.com    : ok=0    changed=0    unreachable=1    failed=0

dominis@shuriken:~/work/temp$ echo $?
3

ansible version:

dominis@shuriken:~/work/temp$ ansible-playbook --version
ansible-playbook 1.4 (devel 41d382b36d) last updated 2013/10/09 12:14:59 (GMT +200)
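
For a CI job that wants to tolerate a small fraction of bad hosts regardless of the exit status, one option is to compute that fraction from the PLAY RECAP lines shown above. The following is only a sketch: the function name `bad_host_percentage` and the regular expression are assumptions written against the recap format in this report, not part of Ansible.

```python
import re

# Matches recap lines in the format shown in this report, for example:
# app10.bfc.kinja-ops.com    : ok=0    changed=0    unreachable=1    failed=0
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)"
    r"\s+unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)\s*$"
)


def bad_host_percentage(recap_text: str) -> float:
    """Return the percentage of hosts with any unreachable or failed count.

    Hypothetical helper: it only inspects captured ansible-playbook output.
    """
    total = 0
    bad = 0
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match is None:
            continue  # skip banners, task output, and blank lines
        total += 1
        if int(match["unreachable"]) > 0 or int(match["failed"]) > 0:
            bad += 1
    if total == 0:
        raise ValueError("no PLAY RECAP host lines found")
    return 100.0 * bad / total
```

A wrapper script could compare the returned value against its own threshold and choose the exit status it reports to the CI system.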

Contributor

mpdehaan commented Oct 9, 2013

Hi Nandor,

In the above you have set max_fail_percentage to 50 and you have had one failure.

So yes, it's doing the right thing in going on -- but I can see what you mean about having a nice way to make sure this failure shows up in Jenkins.

Perhaps we should set a global flag, and if we would have returned 0 while that flag is set, return a new, previously unused exit code instead (I believe 4 is free) and print a message at the bottom.

This is a very good idea, so I'm tagging this as a priority feature request.
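
The proposal in this comment can be sketched roughly as follows. This is an illustration of the idea only, not Ansible's code: the function name, the `failures_tolerated` flag, and the value 4 (taken from the guess in the comment) are all assumptions.

```python
# Hypothetical exit code for "finished, but some failures were tolerated".
# The value 4 is the comment's guess at an unused code, not a confirmed one.
TOLERATED_FAILURE_EXIT = 4


def final_exit_code(normal_exit_code: int, failures_tolerated: bool) -> int:
    """Pick the process exit code under the proposed scheme.

    normal_exit_code: what the run would have returned without the proposal.
    failures_tolerated: the "global flag", set when failures occurred but
        stayed under max_fail_percentage.
    """
    if normal_exit_code == 0 and failures_tolerated:
        return TOLERATED_FAILURE_EXIT
    # Any existing non-zero code is passed through unchanged.
    return normal_exit_code
```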


dominis commented Oct 9, 2013

Hey Michael,

thanks for the quick response!


Contributor

mpdehaan commented Oct 9, 2013

Ordinarily speaking we should return non-zero if there are any failures, so this seems more like max_fail just eating the error code.

But yes, agree that there should be a code here.

@mpdehaan mpdehaan added bug_report and removed feature_idea labels Mar 14, 2014


Contributor

mpdehaan commented Mar 14, 2014

Reclassifying this as a bug report so we can investigate and close this out.

@mpdehaan mpdehaan added P2 and removed P2 labels Mar 19, 2014

jctanner added a commit that referenced this issue Mar 21, 2014


Member

jctanner commented Mar 21, 2014

@dominis I did find a bug in how the failed percentage was calculated when serial is set, which is now fixed by 5b3b9ba

http://docs.ansible.com/playbooks_delegation.html#maximum-failure-percentage
"it may be desirable to abort the play when a certain threshold of failures have been reached"

Max fail percentage was only intended to abort the play if the threshold was reached, but has never been a way to alter the final exit code and override detected host failure/unreachable. We are going to keep that behavior as is, because a change could be very confusing for users who rely on the exit code 0 to mean a perfect execution.
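
The distinction described here, between aborting the play and choosing the exit code, can be illustrated with a small model. It is a sketch under stated assumptions rather than Ansible's implementation: the names are hypothetical, unreachable hosts are counted together with failed ones, and the threshold is treated as something that must be exceeded rather than equaled.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BatchResult:
    """Outcome of one `serial` batch (hypothetical structure)."""
    total_hosts: int
    failed: int
    unreachable: int


def failure_percentage(batch: BatchResult) -> float:
    """Percentage of hosts in the batch that failed or were unreachable."""
    if batch.total_hosts <= 0:
        raise ValueError("batch must contain at least one host")
    return 100.0 * (batch.failed + batch.unreachable) / batch.total_hosts


def should_abort_play(batch: BatchResult, max_fail_percentage: float) -> bool:
    """Abort only when the threshold is exceeded (assumed: not when equaled)."""
    return failure_percentage(batch) > max_fail_percentage


def run_was_perfect(batches: Iterable[BatchResult]) -> bool:
    """Exit status 0 is reserved for runs with no failed or unreachable hosts."""
    return all(b.failed == 0 and b.unreachable == 0 for b in batches)
```

With the numbers from the report (32 hosts, one unreachable, threshold 50), `should_abort_play` is false while `run_was_perfect` is also false, which matches a play that continues but still ends with a non-zero exit status.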
