Fix include loading for handler runs #69459

mwhahaha · 2020-05-12T16:32:02Z

SUMMARY

When using the free (or host_pinned) strategy, if your playbook includes
tasks, roles and handlers, an error can occur when handlers are run
and there are included files to process. In the free.py strategy[0], when
included files are processed there is an additional check to see whether
the included file is a role, which is then processed differently from a
plain included file.

[0]

ansible/lib/ansible/plugins/strategy/free.py

Lines 244 to 254 in c870457

    
    try:
        if included_file._is_role:
            new_ir = self._copy_included_file(included_file)
            new_blocks, handler_blocks = new_ir.get_block_list(
                play=iterator._play,
                variable_manager=self._variable_manager,
                loader=self._loader,
            )
        else:
            new_blocks = self._load_included_file(included_file, iterator=iterator)

Resolves #69457

ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME

ADDITIONAL INFORMATION

When using the free (or host_pinned) strategy, if your playbook includes tasks, roles and handlers, an error can occur when handlers are run and there are included files to process. In the free.py strategy[0], when included files are processed there is an additional check to see whether the included file is a role, which is then processed differently from a plain included file.

[0] https://github.com/ansible/ansible/blob/c8704573396e7480b3e1b33b2ddda2b6325d0d80/lib/ansible/plugins/strategy/free.py#L244-L254

Resolves ansible#69457

EmilienM · 2020-05-12T16:44:29Z

👍 with the proposed solution. Thanks for fixing that bug Alex!

sivel · 2020-05-12T16:56:56Z

At a minimum, this is going to need integration tests to validate what you are fixing, as well as a changelog.

Without a reproducer, I'm not sure I fully understand what is being fixed here. Roles cannot be used as handlers.

mwhahaha · 2020-05-12T17:02:57Z

This is not about roles being used as handlers, but rather that the output of IncludedFile.process_include_results can be roles.

sivel · 2020-05-12T17:04:52Z

> This is not about roles being used as handlers, but rather that the output of IncludedFile.process_include_results can be roles.

I'm still not sure I understand, but I'll await integration tests that include a reproducer.

mwhahaha · 2020-05-12T17:05:05Z

I'll try and figure out a reproducer but it seems to be a race condition based on when/how this is called when using the free strategy.

mwhahaha · 2020-05-12T17:06:26Z

Right now I have a very complex set of playbooks that hits it at a scale of 13 nodes but not at 4. In a 13-node environment it is hit consistently on at least 2-3 nodes.

ansibot · 2020-05-12T17:07:08Z

The test ansible-test sanity --test pep8 failed with 1 error:

lib/ansible/plugins/strategy/__init__.py:1014:33: E126: continuation line over-indented for hanging indent


sivel · 2020-05-12T17:54:31Z

I may have an idea what is happening, and if correct, would mean that this fix is insufficient.

Both _wait_on_handler_results and _wait_on_pending_results use _process_pending_results.

So I am guessing that at least 1 host has made it to processing handlers, and other hosts are still executing normal tasks.

As such, _wait_on_handler_results is consuming include results that are meant for _wait_on_pending_results.

So while this change would fix that situation, it would use _do_handler_run to execute those tasks, rather than run. You wouldn't get true free task execution once this starts happening. This could cause all manner of inconsistencies for callback plugins, and the results of the playbook execution would not be deterministic.

We've been talking about adding a new ITERATING_HANDLERS state to PlayIterator, removing _do_handler_run, and injecting notified handlers into the PlayIterator. That is of course a lot more work, but it would use the strategy's run method for executing handlers as well as regular tasks, since _do_handler_run is effectively its own strategy that mostly adheres to linear concepts.
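
To illustrate the guess above, here is a minimal toy model of two waiters draining one shared result queue. It is only a sketch: the class, method names and tuples are hypothetical stand-ins, not Ansible's actual strategy code, and it ignores workers, locking and timing entirely.

```python
from collections import deque


class ToyStrategy:
    """Hypothetical stand-in for a strategy plugin; not Ansible's real API."""

    def __init__(self):
        # One queue shared by every waiter, as in the scenario described above.
        self.results = deque()

    def process_pending_results(self):
        """Drain and return everything currently in the shared queue."""
        drained = list(self.results)
        self.results.clear()
        return drained

    def wait_on_pending_results(self):
        # Intended consumer of regular task (and include) results.
        return self.process_pending_results()

    def wait_on_handler_results(self):
        # Intended consumer of handler results, but it drains the same queue.
        return self.process_pending_results()


strategy = ToyStrategy()
# host1 has reached its handlers; host2 is still running regular tasks.
strategy.results.append(("host1", "handler", "restart service"))
strategy.results.append(("host2", "task", "include_role: web"))

# The handler waiter runs first and also picks up host2's include result.
handler_view = strategy.wait_on_handler_results()
task_view = strategy.wait_on_pending_results()
```

In this toy run the handler waiter ends up holding host2's include result and the regular waiter sees nothing, which is the kind of cross-consumption being hypothesised.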

mwhahaha · 2020-05-12T18:05:07Z

Yeah, that sounds like what's happening: with free, the handler would get hit on one host first while the others are still doing other tasks.

mwhahaha · 2020-05-12T18:25:14Z

I think this is triggered because we use meta: flush_handlers in some roles.

mwhahaha · 2020-05-12T18:38:46Z

I wonder if this can be addressed by adding logic to _execute_meta to only run handlers on the target host instead of doing so for all systems. Currently it looks like run_handlers doesn't take a host argument, so it'll try and run actions everywhere. Other meta actions like clear_facts, clear_host_errors, end_play and end_host take target_host into consideration when executing.
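
A rough sketch of that idea, assuming a hypothetical helper that picks which notified handlers to flush. Nothing here is Ansible's real code: `handlers_to_flush`, its parameters and the sample data are made up for illustration, and this thread does not establish whether the approach would actually fix the issue.

```python
def handlers_to_flush(notified, all_hosts, target_host=None):
    """Return (host, handler) pairs to run.

    notified:    dict mapping host name -> list of notified handler names
    all_hosts:   every host in the play
    target_host: if given, flush only that host (the proposed behaviour);
                 if None, flush every host (the behaviour described above)
    """
    hosts = [target_host] if target_host is not None else list(all_hosts)
    return [(host, handler) for host in hosts for handler in notified.get(host, [])]


notified = {"host1": ["restart nginx"], "host2": ["reload app"]}
all_hosts = ["host1", "host2", "host3"]

# Host-scoped flush: only host1's handlers are selected.
scoped = handlers_to_flush(notified, all_hosts, target_host="host1")

# Unscoped flush: handlers for every notified host are selected.
everywhere = handlers_to_flush(notified, all_hosts)
```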

mwhahaha · 2020-05-12T22:44:10Z

Yea this doesn't address the problem. I was sort of able to work around it by treating the flush_handlers task like a regular task when advancing the hosts when running under free by wedging something in around here: https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/strategy/free.py#L192-L194

But I end up with issues in _process_pending_results because the task doesn't get found in the self._queued_task_cache. I'm continuing to try and figure something out that isn't "remove all the handlers".

mwhahaha · 2020-05-13T15:19:53Z

I commented over in the original issue, but I have a basic reproducer available @ https://github.com/mwhahaha/ansible-69457

It reproduces with a decent number of nodes like 13

mwhahaha · 2020-05-13T22:26:02Z

#69498 resolves this

mwhahaha force-pushed the include-fix-69457 branch from b678be2 to 150cdf8 May 12, 2020 16:32

ansibot added the needs_revision label and removed the core_review label May 12, 2020

sivel assigned jimi-c May 12, 2020

mattclay added the ci_verified label May 13, 2020

mwhahaha closed this May 13, 2020

mkrizek removed the needs_triage label May 14, 2020

ansible locked and limited conversation to collaborators Jun 10, 2020

mwhahaha deleted the include-fix-69457 branch June 22, 2020 14:02
