
New "quit if playbook doesn't match" hosts breaks provisioning #1187

Closed
titanous opened this issue Oct 1, 2012 · 21 comments · Fixed by #1202
@titanous

titanous commented Oct 1, 2012

I use ansible to provision new EC2 instances on boot, and a2f76c1 breaks provisioning because it doesn't skip playbooks that don't apply, it just quits.

@mpdehaan
Contributor

mpdehaan commented Oct 1, 2012

Help me understand here.

If you had a play that completely failed for all hosts, or didn't select
any hosts, that seems like a good case to stop the playbook run to me.

(If you have steps that you know will fail and you are ok with this, this
is what ignore_errors has been for)
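For reference, a task that is allowed to fail without stopping the play might look roughly like this (the command itself is hypothetical, in the `action:` syntax of the era):

```yaml
---
- hosts: webservers
  tasks:
    # This command may fail on some hosts; with ignore_errors
    # the play continues on those hosts anyway.
    - action: command /usr/local/bin/might-fail
      ignore_errors: yes
```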


@titanous
Author

titanous commented Oct 1, 2012

Maybe I'm doing this wrong, but I have a playbook that includes all the setup playbooks for each node type, and then I just use EC2 tags as the host groups to pick which playbooks apply. So the provisioning script runs ansible-playbook with the "setup" playbook, and only the playbooks that apply to that node type end up getting run, the others are skipped.
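The workflow described above might look roughly like the following top-level playbook (filenames and group names are hypothetical; EC2 dynamic inventory typically exposes instance tags as host groups):

```yaml
---
# setup.yml: one included playbook per node type. Each included playbook
# targets a host group derived from an EC2 tag, so on any given instance
# only the plays whose group matches should run; the rest should be skipped.
- include: webserver.yml   # plays inside use hosts: tag_role_webserver
- include: database.yml    # plays inside use hosts: tag_role_database
- include: worker.yml      # plays inside use hosts: tag_role_worker
```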

@mpdehaan
Contributor

mpdehaan commented Oct 1, 2012

Understood, and you are right.

I'll check tonight and make sure this doesn't happen unless there are
actually nodes addressed as part of the play.

If it doesn't select any nodes, it should abort only the play, not the
whole playbook.

Thanks for the report!


@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Ok, so I have tested this with a couple of host groups and it will only do the "Hard Stop" if the failure is after the setup step, which should fix this for you.

Please update to get this commit -- 5683277 -- and let me know if you're fixed up.

I am pretty sure you should be. Thanks!

@titanous
Author

titanous commented Oct 2, 2012

Thanks, I'll test this out tomorrow.

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Thinking about it, this may not work correctly if gather_facts is disabled, given the way I am counting tasks. I'll make a quick change if so!

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

nm, think I'm wrong.

Thanks!

@dagwieers
Contributor

For me this is still an issue. If I have a set of plays, and one of the plays has 'hosts: webservers' while I am running the playbook on systems not in webservers, the task aborts the whole playbook in every case. This means I can no longer have plays in a playbook that exclude systems I run the playbook for.

@dagwieers
Contributor

I think we need to take into account the number of skipped hosts; if that's not zero, we should not abort the playbook.

@dagwieers
Contributor

Looking at the code, I fail to see how the information about whether hosts were skipped can be made available at the point where we need it.

I also don't understand the design: this decision is made inside _run_task, which is not where such a decision ought to live IMO. It should be made at a higher level, which evaluates before running the task whether there is any need to run it at all, based on the number of "non-failed" systems.

That would also stop the task from being printed when aborting. When you print the task and then abort, you make the user believe that this task is the cause, but it is not.
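As a rough illustration of the higher-level check described above (plain Python with hypothetical names, not Ansible's actual internals): decide the play-level flow before running any task, based on which hosts matched and which have already failed.

```python
def play_flow(matched_hosts, failed_hosts):
    """Decide what to do before running a play's tasks.

    Hypothetical helper, not Ansible's real code:
    - a play that matched no hosts is skipped, not fatal;
    - a play whose matched hosts have all failed aborts the playbook;
    - otherwise the play runs on the remaining hosts.
    """
    if not matched_hosts:
        return "skip_play"
    remaining = [h for h in matched_hosts if h not in failed_hosts]
    if not remaining:
        return "abort_playbook"
    return "run"

print(play_flow([], set()))                    # no hosts matched -> skip_play
print(play_flow(["a", "b"], {"a", "b"}))       # all hosts failed -> abort_playbook
print(play_flow(["a", "b"], {"a"}))            # one host remains -> run
```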

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Let's concentrate on how to reproduce the problem and leave how it's
designed for me to figure out :)

@dagwieers
Contributor

Then here's how to reproduce:

---
- hosts: emptygroup
  tasks:
  - action: command date

- hosts: all
  tasks:
  - action: command true

This fails for all systems not in emptygroup. While it should only skip the first play, and run the second for each system.

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

That worked fine for me last night, actually.

Have you pulled latest?

@dagwieers
Contributor

I did so two hours ago. Output of the pull was:

remote: Counting objects: 140, done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 105 (delta 73), reused 88 (delta 56)
Receiving objects: 100% (105/105), 16.35 KiB, done.
Resolving deltas: 100% (73/73), completed with 33 local objects.
From github.com:ansible/ansible
 * branch            devel      -> FETCH_HEAD
Updating 880328c..f897f19

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Ok, I'll recheck tonight, though I'm pretty sure this is what I did. Perhaps testing a non-existent pattern versus an empty group made a difference.

@titanous still curious on your feedback

@ghost ghost assigned mpdehaan Oct 2, 2012
@dagwieers
Contributor

Just did a new pull:

[dag@moria ansible]$ git pull upstream devel
From github.com:ansible/ansible
 * branch            devel      -> FETCH_HEAD
Current branch devel is up to date.

RPM package that I am using is: ansible-0.8-0.git201210020441.el6.noarch which matches with the log:

 commit f897f19fc5ac783e764244e066adf58099acf520
 Author: Michael DeHaan <michael.dehaan@gmail.com>
 Date:   Mon Oct 1 22:41:00 2012 -0400

     Teach fireball mode to disable the fireball by paying attention to 'minutes=N' (default 30) and do not let fireball module crash
     on input.

(Give or take a couple of timezones)

@dagwieers
Contributor

I have it working for me now, I will send a pull-request for consideration once I have copied my modifications to an Internet-bound system ;-)

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Meanwhile I am making sure I have no unpushed code :)

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

Yeah, it's current -- interested to see what your fix was and how to cause the problem.

@titanous
Author

titanous commented Oct 2, 2012

I'm still getting this message and an immediate abort after hitting a play that doesn't match any hosts when using the latest commit from devel:

FATAL: no hosts matched or all hosts have already failed -- aborting

@mpdehaan
Contributor

mpdehaan commented Oct 2, 2012

dag has a pull request I haven't merged just yet....

@ansible ansible locked and limited conversation to collaborators Apr 24, 2019