Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing dict variable defined in group_vars to role fails when dict merging is enabled #12915

Closed
ngaya-ll opened this issue Oct 26, 2015 · 19 comments
Labels
affects_1.9 This issue/PR affects Ansible v1.9 bug This issue/PR relates to a bug. needs_info This issue requires further information. Please answer any outstanding questions. needs_template This issue/PR has an incomplete description. Please fill in the proposed template correctly. support:core This issue/PR relates to code supported by the Ansible Engineering Team.

Comments

@ngaya-ll
Copy link

Ansible version: 1.9.4

When dict merging is enabled, passing a dict as a role parameter produces an error.

Example:
Consider the following Ansible project:

├── ansible.cfg
├── group_vars
│   └── all
├── inventory
│   └── local.ini
├── roles
│   └── demo
│       └── tasks
│           └── main.yml
└── test.yml

ansible.cfg:

[defaults]
hash_behaviour = merge

test.yml


---
- hosts: localhost
  roles:
    - role: demo
      role_arg: "{{ foo }}"

group_vars/all:


---
foo:
  bar: baz

inventory/local.ini:

localhost ansible_connection=local

roles/demo/tasks/main.yml:


---
- debug: var=role_arg

Expected behavior:

$ ansible-playbook -i inventory/local.ini test.yml 

PLAY [localhost] **************************************************************

TASK: [demo | debug var=role_arg] ********************************************* 
ok: [localhost] => {
    "var": {
        "role_arg": {
            "bar": "baz"
        }
    }
}

PLAY RECAP ******************************************************************** 
localhost                  : ok=2    changed=0    unreachable=0    failed=0 

Actual behavior:

$ ansible-playbook -i inventory/local.ini test.yml 

PLAY [localhost] **************************************************************

TASK: [demo | debug var=role_arg] ********************************************* 
fatal: [localhost] => failed to combine variables, expected dicts but got a 'dict' and a 'unicode'

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
localhost                  : ok=1    changed=0    unreachable=1    failed=0 
@ngaya-ll
Copy link
Author

This works as expected when dict merging is not enabled. This also works when the variable is accessed through hostvars:

---
- hosts: localhost
  roles:
    - role: demo
      role_arg: "{{ hostvars[inventory_hostname]['foo'] }}"

@dagwieers
Copy link
Contributor

We have the exact same error.
And it works fine with Ansible 2.0.0.2, but fails on 1.9.4 and 1.8.4.

When debugging it seems that it wants to merge a dictionary with a string, after digging into combine_vars(a, b) it appears that the value of a is a complete dict (with the contents of packages.foo_api already merged) and the value of b is the string "{{ packages.foo_api }}".

See https://github.com/ansible/ansible/blob/stable-1.9/lib/ansible/utils/__init__.py#L1501

Why the additional merge of something that was already successfully merged is beyond me at this point.

Here is our example:

ansible.cfg

[defaults]
hostfile = hosts
hash_behaviour = merge

group_vars/all

packages:
  foo_api:
    name: api
    version: 1.2.3
    tomcat: 10

roles/foo-framework/tasks/main.yml

---
- name: '{{ foo.name }} : Test merging of vars'
  debug:
    var: foo

test62.yml

---
- hosts: localhost
  gather_facts: no
  roles:
    - role: foo-framework
      foo: '{{ packages.foo_api }}'

With the following output:

[dag@moria ansible.testing]$ swap-ansible ansible-1.9.4 
Changing Ansible version
    ansible-2.0.0.2  ->  ansible-1.9.4
[dag@moria ansible.testing]$ ansible-playbook -l localhost test62.yml

PLAY [localhost] ************************************************************** 

TASK: [foo-framework | {{ foo.name }} : Test merging of vars] ***************** 
fatal: [localhost] => failed to combine variables, expected dicts but got a 'dict' and a 'unicode'

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/home/dag/test62.retry

localhost                  : ok=1    changed=0    unreachable=1    failed=0   

[dag@moria ansible.testing]$ swap-ansible ansible-2.0.0.2
Changing Ansible version
    ansible-1.9.4  ->  ansible-2.0.0.2
[dag@moria ansible.testing]$ ansible-playbook -l localhost test62.yml

PLAY ***************************************************************************

TASK [foo-framework : api : Test merging of vars] ******************************
ok: [localhost] => {
    "foo": {
        "name": "api", 
        "tomcat": 10, 
        "version": "1.2.3"
    }
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0   

@dagwieers
Copy link
Contributor

Using pdb:

(Pdb) c
> /opt/ansible/ansible-1.9.4/lib/ansible/utils/__init__.py(816)merge_hash()
-> _validate_both_dicts(a, b)
(Pdb) a
a = {'version': '1.2.3', 'name': 'api', 'tomcat': 10}
b = {{ packages.foo_api }}
(Pdb) w
  /usr/local/bin/ansible-playbook(324)<module>()
-> sys.exit(main(sys.argv[1:]))
  /usr/local/bin/ansible-playbook(264)main()
-> pb.run()
  /opt/ansible/ansible-1.9.4/lib/ansible/playbook/__init__.py(348)run()
-> if not self._run_play(play):
  /opt/ansible/ansible-1.9.4/lib/ansible/playbook/__init__.py(789)_run_play()
-> if not self._run_task(play, task, False):
  /opt/ansible/ansible-1.9.4/lib/ansible/playbook/__init__.py(497)_run_task()
-> results = self._run_task_internal(task, include_failed=include_failed)
  /opt/ansible/ansible-1.9.4/lib/ansible/playbook/__init__.py(439)_run_task_internal()
-> results = runner.run()
  /opt/ansible/ansible-1.9.4/lib/ansible/runner/__init__.py(1498)run()
-> results = [ self._executor(h, None) for h in hosts ]
  /opt/ansible/ansible-1.9.4/lib/ansible/runner/__init__.py(586)_executor()
-> exec_rc = self._executor_internal(host, new_stdin)
  /opt/ansible/ansible-1.9.4/lib/ansible/runner/__init__.py(668)_executor_internal()
-> inject = self.get_inject_vars(host)
  /opt/ansible/ansible-1.9.4/lib/ansible/runner/__init__.py(647)get_inject_vars()
-> inject = utils.combine_vars(inject, self.role_params)
  /opt/ansible/ansible-1.9.4/lib/ansible/utils/__init__.py(1494)combine_vars()
-> return merge_hash(a, b)
  /opt/ansible/ansible-1.9.4/lib/ansible/utils/__init__.py(825)merge_hash()
-> result[k] = merge_hash(a[k], v)
> /opt/ansible/ansible-1.9.4/lib/ansible/utils/__init__.py(816)merge_hash()
-> _validate_both_dicts(a, b)

@bcoca
Copy link
Member

bcoca commented Feb 18, 2016

The issue here is that templating does not happen until the last possible moment (mostly task execution), in previous versions Ansible would template at several stages while ignoring errors as data was not available (which current behavior guarantees is an error and does not ignore).

@dagwieers
Copy link
Contributor

I have a simple workaround that in case the instance of b is of type basestring, it returns the a dict.
That seems what we want in our specific case, but it doesn't solve the real issue which is why the deliberate Exception was added.

@dagwieers
Copy link
Contributor

@ngaya-ll Can you confirm that this fix works in your case for v1.9 ?
It does for our use-case and I am convinced we hit the exact same bug.

@ngaya-ll
Copy link
Author

From what @bcoca said, the fact that the second argument is a string is a bug, and ignoring it would just be masking the real issue.

@dagwieers
Copy link
Contributor

The fix I propose is not ignoring the string. I am turning the jinja template into the dictionary we were expecting. Look at PR #14565

@ngaya-ll
Copy link
Author

My bad, I thought your above comment was describing the content of the pull request. Looking at the code, your change looks reasonable, but now I'm having trouble reproducing the original error.

@dagwieers
Copy link
Contributor

If you have already moved to v2.0+, the issue does not exist on v2.0 and newer, only starting from v1.8 to v1.9.4.

@ngaya-ll
Copy link
Author

Yep, I'm using 1.9.4

@ngaya-ll
Copy link
Author

My bad, was forgetting the British spelling in hash_behaviour. Now I can reliably reproduce the error.

Your fix did not solve the problem for my original test case. But a similar change with self.role_params did.

@dagwieers
Copy link
Contributor

You are right, I was testing a few things and the PR got mixed up. I updated the PR with the change for role_params.

PS Today I had the exact same issue when making a minimal test-case. Took me some time to pin it to a misspelled "hash_behaviour" ;-)

dagwieers added a commit to dagwieers/ansible that referenced this issue Feb 19, 2016
This fixes ansible#12915

Since combine_vars() is being run directly on role_params, we have to avoid merge_hash() to complain about merging a dict with a string (jinja template).
@ngaya-ll
Copy link
Author

This fix still seems fragile to me. I would worry that the wrong values would get substituted in the template. For instance, the inject dict used for templating the role parameters will not include extra_vars from the command line.

I've seen a couple other bugs where dict merging results in mysterious type errors:

  • Merging templated variables in group_vars/foo with a non-templated dict in group_vars/all
  • Using templated variables from role vars in handlers (works when the variable is moved to role defaults)

The fundamental problem seems to be the ill-defined model of how hash merging interacts with variable resolution. Does dict merging occur before or after template expansion? The answer seems to be "neither", some variables are templated and then merged, other things are merged before templating. It's also unclear what the correct behavior would be. If dict merging occurs first, then we get errors like this bug. If templating occurs first then the values picked up by the template will not reflect the final merged values.

I'd be interested to learn how/if Ansible 2.0 solves this problem.

@dagwieers
Copy link
Contributor

I don't disagree.

The problem is that with Ansible 2.0 a lot has been rewritten (and roles are a complete different beast) so you cannot simply pinpoint what change fixed it. (In fact, when tracing Ansible 2.0 does not call combine_vars() in the same way during the run)

But this issue is very specific to merge_hash() behaviour, and templating at inject time is required for merge_hash() because otherwise we cannot merge dicts at that time as would be expected. So there aren't many options. Either we template beforehand, or we just skip merge_hash() if we are not given two dicts.

BTW as I indicated above the case where it fails, it is in fact merging the same thing, albeit untemplated, onto itself so I don't see the point in doing this in the first place. Look at the below. Dict a and dict b are in fact the exact same thing, but one is templated, the other is not.

(Pdb) c
> /opt/ansible/ansible-1.9.4/lib/ansible/utils/__init__.py(816)merge_hash()
-> _validate_both_dicts(a, b)
(Pdb) a
a = {'version': '1.2.3', 'name': 'api', 'tomcat': 10}
b = {{ packages.foo_api }}

@dagwieers
Copy link
Contributor

@ngaya-ll The alternative solution is to avoid untemplated strings (but still report everything else).

Adding the below to the start of hash_merge():

    # if a is a string, check if it is a template, if so return the other dict
    if isinstance(a, basestring) and a.strip().startswith('{{'):
       if isinstance(b, dict):
           return b.copy()

    # if b is a string, check if it is a template, if so return the other dict
    if isinstance(b, basestring) and b.strip().startswith('{{'):
       if isinstance(a, dict):
           return a.copy()

@dagwieers
Copy link
Contributor

@bcoca @jimi-c What strategy do we want to use to get this issue fixed in v1.9 ?

@ansibot
Copy link
Contributor

ansibot commented Apr 11, 2017

@ngaya-ll Greetings! Thanks for taking the time to open this issue. In order for the community to handle your issue effectively, we need a bit more information.

Here are the items we could not find in your description:

  • issue type
  • component name

Please set the description of this issue with this template:
https://raw.githubusercontent.com/ansible/ansible/devel/.github/ISSUE_TEMPLATE.md

click here for bot help

@ansibot ansibot added needs_info This issue requires further information. Please answer any outstanding questions. needs_template This issue/PR has an incomplete description. Please fill in the proposed template correctly. labels Apr 11, 2017
@ansibot ansibot added the support:core This issue/PR relates to code supported by the Ansible Engineering Team. label Jun 29, 2017
@ansibotdev
Copy link

@ngaya-ll You have not responded to information requests in this issue so we will assume it no longer affects you. If you are still interested in this, please create a new issue with the requested information.

click here for bot help

@ansibot ansibot added bug This issue/PR relates to a bug. and removed bug_report labels Mar 7, 2018
@ansible ansible locked and limited conversation to collaborators Apr 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_1.9 This issue/PR affects Ansible v1.9 bug This issue/PR relates to a bug. needs_info This issue requires further information. Please answer any outstanding questions. needs_template This issue/PR has an incomplete description. Please fill in the proposed template correctly. support:core This issue/PR relates to code supported by the Ansible Engineering Team.
Projects
None yet
Development

No branches or pull requests

5 participants