Ansible seems to have issues resolving nested variables for delegate_to #79450

Open
1 task done
The-Judge opened this issue Nov 23, 2022 · 7 comments
Labels
affects_2.13 bug This issue/PR relates to a bug. needs_verified This issue needs to be verified/reproduced by maintainer

Comments

@The-Judge

Summary

Under reproducible circumstances, Ansible intermittently fails to resolve a nested variable passed to delegate_to. I have provided a demo that shows the behavior.

Issue Type

Bug Report

Component Name

delegate_to

Ansible Version

$ ansible --version
ansible [core 2.13.5]
  config file = /Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg
  configured module search path = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/library']
  ansible python module location = /usr/local/Cellar/ansible/6.5.0/libexec/lib/python3.10/site-packages/ansible
  ansible collection location = /Users/mrichter/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.10.8 (main, Oct 13 2022, 10:17:43) [Clang 14.0.0 (clang-1400.0.29.102)]
  jinja version = 3.1.2
  libyaml = True

Configuration

# if using a version older than ansible-core 2.12 you should omit the '-t all'
$ ansible-config dump --only-changed -t all
DEFAULT_ACTION_PLUGIN_PATH(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/action_plugins']
DEFAULT_FORKS(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = 20
DEFAULT_HOST_LIST(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/inventory']
DEFAULT_MODULE_PATH(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/library']
DEFAULT_MODULE_UTILS_PATH(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/module_utils']
DEFAULT_ROLES_PATH(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/roles']
DEFAULT_VAULT_IDENTITY_LIST(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = ['ansible_default@clients/vault-pass-client.py']
HOST_KEY_CHECKING(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = False
INTERPRETER_PYTHON(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = auto_silent
PLAYBOOK_DIR(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = /Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/playbooks

CONNECTION:
==========

paramiko_ssh:
____________
host_key_checking(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = False

ssh:
___
host_key_checking(/Users/mrichter/Nextcloud/Dokumente/Entwicklung/ansible/ansible.merged.cfg) = False

OS / Environment

Manager node OSes: MacOS, Ubuntu
Node OS (Destinations): Debian

Steps to Reproduce

A complete Ansible setup that reproduces the issue, along with the output of an affected run in the README, is available in this separate repository: https://github.com/The-Judge/ansible-delegate-demo

Expected Results

I expect Ansible to fully resolve all variables passed to it, rather than produce something half-resolved such as:
Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name or service not known
Here the literal string inventory_hostname ends up being used as the hostname.

Actual Results

fatal: [sbsdevcore01 -> sbsdevcore02]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: nodename nor servname provided, or not known", "unreachable": true}

Code of Conduct

  • I agree to follow the Ansible Code of Conduct
@ansibot
Contributor

ansibot commented Nov 23, 2022

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.


@ansibot ansibot added affects_2.13 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. labels Nov 23, 2022
@lathama
Contributor

lathama commented Nov 23, 2022

With Ansible 2.12 I get a different error:

use_vars = task_vars.get('ansible_delegated_vars')[self._task.delegate_to]
KeyError: 'sbsdevcore03'

With Ansible 2.14 I can reproduce it as:

fatal: [sbsdevcore01 -> sbsdevcore03]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name or service not known",
    "unreachable": true
}

Note that this appears to happen when my_execution_node is the same as the current node:

ok: [sbsdevcore01] => {
    "msg": "my_execution_node: sbsdevcore01"
}
ok: [sbsdevcore02] => {
    "msg": "my_execution_node: sbsdevcore02"
}
ok: [sbsdevcore03] => {
    "msg": "my_execution_node: sbsdevcore01"
}

and it succeeds with, for example:

ok: [sbsdevcore02] => {
    "msg": "my_execution_node: sbsdevcore02"
}
ok: [sbsdevcore01] => {
    "msg": "my_execution_node: sbsdevcore02"
}
ok: [sbsdevcore03] => {
    "msg": "my_execution_node: sbsdevcore03"
}

So your task with run_once: true picks host[0], and it appears to always fail when sbsdevcore01 is delegated to sbsdevcore01.

Still trying to test this.

@bcoca
Member

bcoca commented Nov 23, 2022

Note that run_once: true has a special behavior: it updates the facts for 'all' hosts in the play.

@The-Judge
Author

The-Judge commented Nov 23, 2022

@lathama: I do not entirely understand. I do not think it happens when the randomly selected node is the same as the node that triggers the delegated execution.

First, sticking to your example, Ansible displays this:

fatal: [sbsdevcore01 -> sbsdevcore03]

The fact that it says sbsdevcore03 here, and not inventory_hostname, tells me that Ansible tried to delegate a task triggered from sbsdevcore01 to sbsdevcore03. However, the error, which seems to come from the SSH client Ansible uses to establish connections, suggests that Ansible is running a command similar to:

ssh inventory_hostname -- whatever

Compare this with the output of SSH itself (on macOS):
ansible-delegate-demo $ ssh inventory_hostname
ssh: Could not resolve hostname inventory_hostname: nodename nor servname provided, or not known

This looks very familiar: it is the same error I get on my Mac. Compare with the README.md in the repo at https://github.com/The-Judge/ansible-delegate-demo/blob/main/README.md :

fatal: [sbsdevcore01 -> sbsdevcore02]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: nodename nor servname provided, or not known", "unreachable": true}

It is simply what the SSH version on my Mac prints.

Somehow, the resolved value does not seem to be handed over to the task inside Ansible. I do not know how to describe this any better.

@The-Judge
Author

A colleague of mine just found #28231, which seems to have addressed this in Ansible 2.3 back in 2018, and it seems to be related to random. This really blows my mind:

In line 11 of main.yml, I do this:

my_execution_node: "{{ groups[my_group_name] | random }}"

From my understanding, my_execution_node has a fixed value from that point on, which the subsequent debug in line 3 of sub_task.yml displays correctly:

- debug:
    msg: "my_execution_node: {{ my_execution_node }}"

This works fine!

But using the same variable in the shell task below (line 9 of sub_task.yml) results in this issue:

- name: This is my sub-task
  shell:
    cmd: echo "Hello World!"
  run_once: true
  delegate_to: "{{ my_execution_node }}"
  delegate_facts: true

And here it comes: if my_execution_node is defined using more or less anything other than random, it works every time! Try this line instead to set my_execution_node:

my_execution_node: "{{ groups[my_group_name] | first }}"

How can this be called anything but a bug?
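One possible reading of the difference between random and first, stated here as an assumption and not confirmed by anyone in this thread: if a variable defined through a template is rendered again on every read instead of once, then the debug task and delegate_to can each see a different result of random, while first gives the same answer every time. The Python sketch below is a toy model of that idea only. The class LazyVars and the names via_random and via_first are hypothetical and do not come from Ansible's code base.

```python
import random

# Toy model only: this is NOT Ansible's implementation. It assumes that a
# variable defined with a template is re-rendered on every read.
class LazyVars:
    def __init__(self):
        self._templates = {}

    def define(self, name, render):
        # Store the "template" (a callable), not its rendered value.
        self._templates[name] = render

    def get(self, name):
        # Every read renders the template again.
        return self._templates[name]()


group = ["sbsdevcore01", "sbsdevcore02", "sbsdevcore03"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

lazy = LazyVars()
lazy.define("via_random", lambda: rng.choice(group))  # stands in for `| random`
lazy.define("via_first", lambda: group[0])            # stands in for `| first`

# Read each variable many times, as several tasks and keywords would.
random_reads = {lazy.get("via_random") for _ in range(50)}
first_reads = {lazy.get("via_first") for _ in range(50)}

print("distinct values seen via random:", sorted(random_reads))
print("distinct values seen via first: ", sorted(first_reads))
```

In this model the random-backed variable yields more than one host across reads, whereas the first-backed one never changes. Whether this is what actually produces the inventory_hostname error is not established here.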

@The-Judge
Author

Same effect with shuffle, so this does not work either and is not a possible workaround:

my_execution_node: "{{ groups[my_group_name] | shuffle | first }}"

@The-Judge
Author

Found a workaround:

If the random element is picked and assigned to a fact in a YAML file that is run via include_tasks, it works:

roles/test/tasks/get_random_element.yml:

---
- name: take 'list' and return a single element from it as 'elem'
  set_fact:
    elem: "{{ list|random }}"
...

roles/test/tasks/main.yml:

---
- name: get random element from groups[my_group_name]
  include_tasks: get_random_element.yml
  vars:
    list: "{{ groups[my_group_name] }}"

- name: Main Level Task
  include_tasks: sub_task.yml
  vars:
    my_execution_node: "{{ elem }}"
...
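A plausible reason why the set_fact detour helps, offered as an assumption and not as a confirmed diagnosis: set_fact stores the rendered result, so every later reader gets the same host instead of triggering a fresh random pick. The Python sketch below only mimics that "render once, then reuse" pattern. The function render_random and the facts dictionary are hypothetical stand-ins, not Ansible internals.

```python
import random

group = ["sbsdevcore01", "sbsdevcore02", "sbsdevcore03"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable


def render_random():
    # Hypothetical stand-in for rendering "{{ list | random }}".
    return rng.choice(group)


# Analogue of set_fact: render exactly once and keep the result.
facts = {"elem": render_random()}

# Later consumers read the stored value; nothing is rendered again.
seen_by_debug = facts["elem"]
seen_by_delegate_to = facts["elem"]

print("debug sees:      ", seen_by_debug)
print("delegate_to sees:", seen_by_delegate_to)
```

Both readers observe the same host because the random choice happened once, before either of them ran.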

@mkrizek mkrizek added needs_verified This issue needs to be verified/reproduced by maintainer and removed needs_triage Needs a first human triage before being processed. labels Nov 29, 2022