Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"retries" without "until": retry until success #20802

Open
candlerb opened this issue Jan 29, 2017 · 12 comments

Comments

Projects
None yet
8 participants
@candlerb
Copy link
Contributor

commented Jan 29, 2017

ISSUE TYPE
  • Feature Idea
COMPONENT NAME

retries

ANSIBLE VERSION
ansible 2.2.1.0
  config file =
  configured module search path = Default w/o overrides
CONFIGURATION

N/A

OS / ENVIRONMENT

N/A

SUMMARY

At the moment, giving "retries:" without an "until:" condition has no effect.

It would be useful if this could be implemented to mean "retry until success"

(Or if not, at least generate an error saying that retries has no effect without until)

STEPS TO REPRODUCE
- hosts: localhost
  tasks:
  - file: path=/tmp/zzz state=absent

  - name: testing timeout
    get_url:
      url: http://1.2.3.4/zzz
      dest: /tmp/zzz
    retries: 2
    delay: 3
EXPECTED RESULTS

The module is retried 2 times.

ACTUAL RESULTS

The module is run only once.

PLAY [localhost] ***************************************************************

TASK [setup] *******************************************************************
ok: [localhost]

TASK [file] ********************************************************************
ok: [localhost]

TASK [testing timeout] *********************************************************
fatal: [localhost]: FAILED! => {"changed": false, "dest": "/tmp/zzz", "failed": true, "msg": "Request failed: <urlopen error timed out>", "state": "absent", "url": "http://1.2.3.4/zzz"}
	to retry, use: --limit @/home/ubuntu/ert.retry

To get the desired behaviour you have to add three lines:

    register: result
    until: result|succeeded
    ignore_errors: yes

But an unfortunate side-effect of ignore_errors is that a failure even after all the retries is ignored:

TASK [testing timeout] *********************************************************
FAILED - RETRYING: TASK: testing timeout (2 retries left).
FAILED - RETRYING: TASK: testing timeout (1 retries left).
fatal: [localhost]: FAILED! => {"attempts": 2, "changed": false, "dest": "/tmp/zzz", "failed": true, "msg": "Request failed: <urlopen error timed out>", "state": "absent", "url": "http://1.2.3.4/zzz"}
...ignoring

So you need to add another task, something like this:

  - name: report error
    fail:
      msg: 'Unable to download {{result.url}}: {{result.response|default(result.msg)}}'
    when: not result|succeeded

This is pretty messy when all you wanted to do is say "if this task fails, retry it up to N times and then give up"

feanil added a commit to edx/configuration that referenced this issue Sep 5, 2017

`retries` don't work without `until`
The `retries` statement is just ignored unless an `until` clause
is provided.

ansible/ansible#20802

feanil added a commit to edx/configuration that referenced this issue Sep 6, 2017

`retries` don't work without `until`
The `retries` statement is just ignored unless an `until` clause
is provided.

ansible/ansible#20802

@feanil feanil referenced this issue Sep 6, 2017

Merged

`retries` don't work without `until` #4079

0 of 6 tasks complete

feanil added a commit to edx/configuration that referenced this issue Sep 6, 2017

`retries` don't work without `until`
The `retries` statement is just ignored unless an `until` clause
is provided.

ansible/ansible#20802
@wvidana

This comment has been minimized.

Copy link
Contributor

commented Oct 12, 2017

Yup, this just happened to me on Ansible 2.3. The run failed and aborted instead of retrying (I didn't set until neither, I expected it to retry until task succeded)

ansible 2.3.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.13 (default, Feb 13 2017, 15:15:09) [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]

@ansibot ansibot added feature and removed feature_idea labels Mar 2, 2018

@MPV

This comment has been minimized.

Copy link

commented May 15, 2018

Is this maybe solved now using the failed_when feature?

See:
http://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#controlling-what-defines-failure

I have (so far) successfully migrated my tasks into failed_when from the above mentioned "fail: when: not result|succeeded" method.

@MPV

This comment has been minimized.

Copy link

commented May 15, 2018

A complete example of what I changed it into to get it to work:

- name: check nexus
  get_url:
    url: "http://{{ nexus_host }}/nexus/"
    dest: /tmp/nexus
    force: yes
  register: nexus_result
  until: nexus_result|succeeded
  retries: 10
  delay: 15
  failed_when: nexus_result is failure
@candlerb

This comment has been minimized.

Copy link
Contributor Author

commented May 15, 2018

I have retested with ansible 2.5, and now the "ignore_errors:" part is no longer needed.

So the following works correctly, in both success and failure situations:

- hosts: localhost
  tasks:
  - file: path=/tmp/zzz state=absent

  - name: testing timeout
    get_url:
      url: http://1.2.3.4/zzz
      dest: /tmp/zzz
    retries: 2
    delay: 3
    register: result            # <<<
    until: result is succeeded  # <<<

  - debug:
      msg: finished

There is no benefit from adding failed_when: result is failed (although it doesn't stop it working).

Specifying retries: without until: still does nothing though, meaning you need to add the two highlighted lines.

It's better than it was, but I still think the default for retries: should be "until success", or at least emit a warning that retries: without until: does nothing.

@wvidana

This comment has been minimized.

Copy link
Contributor

commented May 15, 2018

I believe this is still happening on 2.5

@candlerb

This comment has been minimized.

Copy link
Contributor Author

commented May 15, 2018

@woqer: what exactly is still happening? Can you provide a sample playbook?

The playbook I just posted was tested with ansible 2.5.0-1ppa~xenial

It shows that you still need an until: condition with retries:, but you no longer need ignore_errors: nor a subsequent playbook task to catch errors.

@wvidana

This comment has been minimized.

Copy link
Contributor

commented May 15, 2018

@candlerb you still need until: with retries: on 2.5.1, sorry I wasn't specific before

@jeking3

This comment has been minimized.

Copy link
Contributor

commented Dec 10, 2018

retries: without any other conditions should check for succeeded by default; please implement this!

@SJrX

This comment has been minimized.

Copy link

commented Feb 22, 2019

Yes please do, it was very confusing. At the very least implement an error: For example the api.ipify.org fails sometimes so I tried:

- name: "Determine current public IP Address"
  ipify_facts:
  register: ipify_facts
  retries: 10
  delay: 5

That doesn't work but doesn't generate an error or anything.

@captainjs

This comment has been minimized.

Copy link

commented May 10, 2019

Hello,
It's been a while (over a year) since that problem was reported.
The situation is the same today: the retry was not working because we haven't provided an "until".
That's documented, but still, on operational level we are expecting to have a warning displayed, "warning: a reply without until is present, so the retry is ignored" or something like that.

A user just wasted a few hours on that until we finally realised what was happening.

At this point, it seems to be lost in merging, various blabla about it, and the issue is not addressed.

Can someone clarify the status?

@jeking3

This comment has been minimized.

Copy link
Contributor

commented May 15, 2019

The point of this issue is to determine what happens when until is removed. You provided it for both tests.

@captainjs

This comment has been minimized.

Copy link

commented May 16, 2019

The point of this issue is to determine what happens when until is removed. You provided it for both tests.

Ok but is there anyone working on ensuring that there's at least a warning when there's only "reply" and no "until"? So that the user can see it's ignored?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.