any_errors_fatal is not working as expected with block/rescue #49041

Closed
kokasha opened this issue Nov 22, 2018 · 9 comments
Labels
affects_2.7 This issue/PR affects Ansible v2.7 bug This issue/PR relates to a bug. support:core This issue/PR relates to code supported by the Ansible Engineering Team.

Comments

@kokasha

kokasha commented Nov 22, 2018

SUMMARY

With the option "any_errors_fatal" set at the block level, a failure on a single node should make all nodes fail and then execute "rescue" on all nodes. A previous issue (#14024) discussed this same problem, and it should have been fixed since release 2.0. However, testing with release 2.7.2, I can see that the behavior is not as expected.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

any_errors_fatal with block/rescue

ANSIBLE VERSION

ansible 2.7.2
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/vagrant/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.12 (default, Nov 12 2018, 14:36:49) [GCC 5.4.0 20160609]
CONFIGURATION
ansible-config dump --only-changed
None
OS / ENVIRONMENT

This behavior concerns Ansible execution itself and is independent of the target OS.

STEPS TO REPRODUCE

Minimal Test Case

Inventory:
h1 ansible_connection=local
h2 ansible_connection=local
h3 ansible_connection=local
h4 ansible_connection=local
h5 ansible_connection=local

---
- hosts: all
  gather_facts: no
  tasks:
  - block:
      - fail:
        when: inventory_hostname == 'h3'
    rescue:
      - name: Save By Rescue
        debug: msg="here we are in the rescue"
    any_errors_fatal: yes
EXPECTED RESULTS

The expected result is that the rescue block runs on all nodes, not only on the failed one (since we are using any_errors_fatal=yes).

ACTUAL RESULTS

The rescue block is executed only for the failed host, not for all hosts.


PLAY [all] ************************************************************************************************************************************

TASK [fail] ***********************************************************************************************************************************
skipping: [h1]
skipping: [h2]
fatal: [h3]: FAILED! => {"changed": false, "msg": "Failed as requested from task"}
skipping: [h4]
skipping: [h5]

TASK [Save By Rescue] *************************************************************************************************************************
ok: [h3] => {
    "msg": "here we are in the rescue"
}

PLAY RECAP ************************************************************************************************************************************
h1                         : ok=0    changed=0    unreachable=0    failed=0
h2                         : ok=0    changed=0    unreachable=0    failed=0
h3                         : ok=1    changed=0    unreachable=0    failed=1
h4                         : ok=0    changed=0    unreachable=0    failed=0
h5                         : ok=0    changed=0    unreachable=0    failed=0
@ansibot
Contributor

ansibot commented Nov 22, 2018

Hi @kokasha, thank you for submitting this issue!


@ansibot
Contributor

ansibot commented Nov 22, 2018

Files identified in the description:
None

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


@ansibot ansibot added affects_2.7 This issue/PR affects Ansible v2.7 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Nov 22, 2018
@mkrizek
Contributor

mkrizek commented Nov 23, 2018

FWIW, introduced in ac89b0d. Git bisect points to fbec2d9 as the commit that "broke" the functionality; however, the code has changed so heavily since then that this is not very useful.

@bcoca
Member

bcoca commented Nov 23, 2018

@kokasha In any case your expectation does not match our intended feature.

  • any_errors_fatal produces an immediate and NON recoverable error for all hosts (rescues don't run)
  • any_errors_fatal won't get triggered by a rescued host/task, but will if the rescue itself fails
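
The first point can be illustrated with a minimal sketch, assuming the h1 to h5 inventory from the issue description. The task names and messages are made up for illustration, and this is only one way to observe the behaviour, not an official test case. With any_errors_fatal set at the play level and no rescue involved, a failure on h3 is expected to end the play for every host once the failing task has finished on the current batch:

```yaml
---
- hosts: all
  gather_facts: no
  any_errors_fatal: true   # play-level setting, no block/rescue in this sketch
  tasks:
    # Fails on h3 only; every other host skips this task.
    - name: Fail on a single host (illustrative)
      fail:
        msg: "simulated failure"
      when: inventory_hostname == 'h3'

    # Expected not to run on any host once h3 has failed above.
    - name: Follow-up task (illustrative)
      debug:
        msg: "still running on {{ inventory_hostname }}"
```

If the follow-up task still runs on h1, h2, h4 and h5, the setting is not being applied to the play as described.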

The behaviour you are looking for can be achieved with a fail or assert task in the block that compares the ansible_play_hosts family of variables and then forces a failure/rescue for all the hosts.

@kokasha
Author

kokasha commented Nov 24, 2018

@bcoca Can you please explain in more detail what behavior we should expect to see?

The behavior I was expecting is a kind of rollback: I run one or more tasks inside the block on a number of hosts, and if a task fails on any host, the rescue tasks should run on all the hosts in order to roll back all the changes made by the block tasks.

Also, in the previously raised issue #14024, this behavior was possible and was implemented, so why is it not working at the moment?

Finally, can you please show with a simple playbook how to implement this rollback behavior using fail or assert, as mentioned in your reply?

Thanks a lot for your support and feedback

@bcoca
Member

bcoca commented Nov 26, 2018

What I expect from your playbook:

  • all hosts except h3 skip the 'fail' task and don't run rescue
  • host h3 executes the fail task and then runs the rescue task

In the ticket you point at, any_errors_fatal ignored rescue blocks, and the fix was to change it not to. This might have allowed the behaviour you say you expect, but subsequent fixes brought it to what we wanted, which is the behaviour you observe above.

As for the example, you should really try IRC or the mailing lists, but here it is anyways:

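- hosts: all
  gather_facts: no
  tasks:
  - block:
      # Fails on one host only; the result is registered so other hosts can inspect it.
      - fail:
        when: inventory_hostname == 'mailer1'
        register: myfail

      # Hosts that skipped the task above fail here if any host's registered
      # result is failed, which sends them into the rescue section as well.
      - name: fail if any host failed previous task
        fail:
        when: (hostvars.values()|selectattr('myfail', 'failed')|list)|length > 0
    rescue:
      - name: Save By Rescue
        debug: msg="here we are in the rescue"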
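
A variant of the playbook above, sketched under a few assumptions: the failing step uses ignore_errors so that the failing host stays in the play and its registered result remains visible, the check is limited to the hosts of the current play through ansible_play_hosts, and the host name h3 and the variable name step_result are placeholders. It has not been verified against every Ansible release, so treat it as a starting point rather than the recommended approach:

```yaml
---
- hosts: all
  gather_facts: no
  tasks:
    - block:
        # Placeholder for the real change; fails on h3 only.
        # ignore_errors keeps h3 in the play so the next task still runs there.
        - name: Step that may fail on some hosts (illustrative)
          fail:
            msg: "simulated failure"
          when: inventory_hostname == 'h3'
          register: step_result
          ignore_errors: yes

        # Every host evaluates the same expression over all play hosts,
        # so they all pass or all fail together.
        - name: Fail everywhere if the step failed on any play host
          assert:
            that:
              - ansible_play_hosts | map('extract', hostvars, 'step_result') | select('failed') | list | length == 0
      rescue:
        - name: Roll back (illustrative)
          debug:
            msg: "rolling back on {{ inventory_hostname }}"
```

Because the assert task evaluates the same condition on every host, either all hosts pass it or all hosts fail it and enter rescue together, which is the rollback-style behaviour asked about earlier in the thread.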

@bcoca
Member

bcoca commented Nov 26, 2018

@abadger abadger removed the needs_triage Needs a first human triage before being processed. label Nov 27, 2018
@bcoca
Member

bcoca commented Nov 27, 2018

Possible Misunderstanding

Hi!

Thanks very much for your submission to Ansible. It sincerely means a lot to us.

We believe the ticket you have filed is being somewhat misunderstood, as one thing works a little differently than stated.

As stated in the link above, this is expected behaviour.

In the future, this might be a topic more well suited for the user list, which you can also post here if you'd like some more help with the above.

Thank you once again for this and your interest in Ansible!

@mkrizek
Contributor

mkrizek commented Mar 20, 2019

Closing as per above.

@mkrizek mkrizek closed this as completed Mar 20, 2019
@ansible ansible locked and limited conversation to collaborators Jul 25, 2019