Tasks with include_vars take ages :( #13221

Closed · zentavr opened this issue Nov 19, 2015 · 24 comments
Labels: bug (This issue/PR relates to a bug.), needs_info (This issue requires further information. Please answer any outstanding questions.)
Milestone: v2

zentavr commented Nov 19, 2015

Issue Type: Bug Report
Ansible Version: 2.0.0 devel@d75e707
Ansible Configuration:

[defaults]
host_key_checking = False
forks=20

Environment: Mac OS X El Capitan 10.11.1

Summary
I have a playbook that starts like this:

- name: "Setting up all {{ game_env }} variables"
  hosts: tag_deployment_ansible
  gather_facts: False
  connection: local
  no_log: True
  tasks:
    - name: Including the defaults
      include_vars: "variables/_defaults.yml"
      tags: [ 'always' ]

    - name: Including AWS Specific Variables
      include_vars: "variables/{{ aws_env }}.yml"
      tags: [ 'always' ]

    - name: Including AWS Specific Secret Variables
      include_vars: "variables/{{ aws_env }}-secrets.yml"
      tags: [ 'always' ]

    - name: Including Game+AWS Specific Variables
      include_vars: "{{ item }}"
      with_first_found:
        - "variables/{{ game_env }}-{{ aws_env }}.yml"
        - "variables/_defaults-{{ game }}-{{ aws_env }}.yml"
      tags: [ 'always' ]

    - name: Including Game+AWS Specific Secrets
      include_vars: "{{ item }}"
      with_first_found:
        - "variables/{{ game_env }}-{{ aws_env }}-secrets.yml"
        - "variables/_defaults-{{ game }}-{{ aws_env }}-secrets.yml"
      tags: [ 'always' ]

    - name: Including Game Variables
      include_vars: "variables/game-{{ game }}.yml"
      tags: [ 'always' ]

game_env and aws_env come from the command line (the -e parameter).
Executing this step takes a little less than an hour, while Ansible 1.9.4 runs the same thing in ~30 seconds. I have up to 70 machines in my inventory.
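
For reference, an invocation of such a play looks roughly like this (playbook name illustrative; the -e values here are the ones used in the reproduction case later in this thread):

ansible-playbook site.yml -e game_env=arenamulti -e aws_env=cashdev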

Expected Results: ~30 seconds

jimi-c (Member) commented Nov 23, 2015

@zentavr can you provide any details about how many vars are in these various files?

jimi-c added the bug_report and needs_info labels Nov 23, 2015
jimi-c added this to the v2 milestone Nov 23, 2015
zentavr (Author) commented Nov 23, 2015

@jimi-c - do you want me to attach files?

jimi-c (Member) commented Nov 23, 2015

@zentavr sure, if there's nothing sensitive in them. A gist is also fine.

zentavr (Author) commented Nov 23, 2015

@jimi-c - there are sensitive values inside, so I need to figure out the best way to share them.

jimi-c (Member) commented Nov 23, 2015

@zentavr if you just do a wc -l on each file, that'll be good enough for me to try and reproduce it.
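
For example, to gather those counts across all the vars files at once:

wc -l variables/*.yml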

zentavr (Author) commented Nov 24, 2015

@jimi-c - here it is: https://gist.github.com/zentavr/d3727b53bde29df02250
The passwords are not real.
Here is the host list from the inventory run:

ok: [172.29.8.163] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.46.222] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.45.227] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.109] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.7.141] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.45.50] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.45.174] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.40.99] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.132] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.8.64] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.37.18] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.3.161] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.12.126] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.193] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.4.192] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.15.238] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.41.254] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.12] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.11.100] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.15.86] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.46.252] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.43.176] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.33.88] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.40.67] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.12.68] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.37.175] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.12.174] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.0.101] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.11.151] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.38.199] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.14.216] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.0.215] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.227] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.41.155] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.254] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.34.225] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.0.51] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.164] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.43.216] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.4.62] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.238] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.37.147] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.53] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.37] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.14.57] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.14.246] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.33.255] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.13.173] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.35.168] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.36.203] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.10.10] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.37.95] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.32.248] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.14.56] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.12.176] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.6.23] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.37.207] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.34.38] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.5.134] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.12.76] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}
ok: [172.29.41.132] => (item=(censored due to no_log)) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}

zentavr (Author) commented Nov 24, 2015

I upgraded to 2.0.0-0.6.rc1 and the problem is still there.

zentavr (Author) commented Nov 24, 2015

@jimi-c - I wonder if #12239 could be similar to my problem?

jimi-c (Member) commented Nov 25, 2015

@zentavr possibly, though we have considerably closed the gap on timings now. That issue is being held open as a catch-all until we determine whether other types of slowness (like yours) are related or not. Since you're still seeing this, I'd say they're not.

zentavr (Author) commented Nov 25, 2015

@jimi-c - I wonder, is pull request #13000 included in v2.0.0-0.6.rc1?

jimi-c (Member) commented Nov 25, 2015

@zentavr it does.

zentavr (Author) commented Nov 25, 2015

@jimi-c I set var_compression_level=9 and forks=10 (was forks=20) and, frankly speaking, didn't see any performance improvement :(
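
For clarity, the resulting ansible.cfg would combine the settings from the report above with these changes; var_compression_level controls how task variables are compressed before being shipped to the workers, per the zlib.compress call shown later in this thread:

[defaults]
host_key_checking = False
forks = 10
var_compression_level = 9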

Yannig (Contributor) commented Nov 25, 2015

Agreed, ansible v2 is a little bit slow here

Yannig (Contributor) commented Nov 25, 2015

With this playbook, Ansible 2 spends a lot of time in _process_pending_results in lib/ansible/plugins/strategy/__init__.py, copying the variable_manager:

self._tqm._hostvars_manager.hostvars().set_variable_manager(self._variable_manager)

I think a good way to reduce this overhead would be to update the existing objects instead of copying them each time.

I'll take a look and see if I can improve something here.

bcoca (Member) commented Nov 25, 2015

There are issues with updating the common objects, especially when connection data is set at the task level, as this might then affect all subsequent tasks.

Yannig (Contributor) commented Nov 26, 2015

Well, forget the previous fix; I'm working on a better way of handling this by avoiding the copy altogether. Here is the resulting patch:

diff --git a/lib/ansible/executor/task_queue_manager.py b/lib/ansible/executor/task_queue_manager.py
index 0f0dba0..0f8f16e 100644
--- a/lib/ansible/executor/task_queue_manager.py
+++ b/lib/ansible/executor/task_queue_manager.py
@@ -202,6 +202,7 @@ class TaskQueueManager:
             #        to do this with a proper BaseProxy/DictProxy derivative
             exposed=(
                 'set_variable_manager', 'set_inventory', '__contains__', '__delitem__',
+                'set_nonpersistent_facts', 'set_host_facts', 'set_host_variable',
                 '__getitem__', '__len__', '__setitem__', 'clear', 'copy', 'get', 'has_key',
                 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values'
             ),
diff --git a/lib/ansible/plugins/strategy/__init__.py b/lib/ansible/plugins/strategy/__init__.py
index 405a8c7..1de7897 100644
--- a/lib/ansible/plugins/strategy/__init__.py
+++ b/lib/ansible/plugins/strategy/__init__.py
@@ -275,7 +275,7 @@ class StrategyBase:
                     var_value = wrap_var(result[3])

                     self._variable_manager.set_nonpersistent_facts(host, {var_name: var_value})
-                    self._tqm._hostvars_manager.hostvars().set_variable_manager(self._variable_manager)
+                    self._tqm._hostvars_manager.hostvars().set_nonpersistent_facts(host, {var_name: var_value})

                 elif result[0] in ('set_host_var', 'set_host_facts'):
                     host = result[1]
@@ -300,13 +300,15 @@ class StrategyBase:
                         var_value = result[5]

                         self._variable_manager.set_host_variable(target_host, var_name, var_value)
+                        self._tqm._hostvars_manager.hostvars().set_host_variable(target_host, var_name, var_value)
                     elif result[0] == 'set_host_facts':
                         facts = result[4]
                         if task.action == 'set_fact':
                             self._variable_manager.set_nonpersistent_facts(target_host, facts)
+                            self._tqm._hostvars_manager.hostvars().set_nonpersistent_facts(target_host, facts)
                         else:
                             self._variable_manager.set_host_facts(target_host, facts)
-                    self._tqm._hostvars_manager.hostvars().set_variable_manager(self._variable_manager)
+                            self._tqm._hostvars_manager.hostvars().set_host_facts(target_host, facts)

                 else:
                     raise AnsibleError("unknown result message received: %s" % result[0])
diff --git a/lib/ansible/vars/hostvars.py b/lib/ansible/vars/hostvars.py
index 130bc3f..1c9377e 100644
--- a/lib/ansible/vars/hostvars.py
+++ b/lib/ansible/vars/hostvars.py
@@ -79,6 +79,15 @@ class HostVars(collections.Mapping):
             self._cached_result[sha1_hash] = result
         return result

+    def set_host_variable(self, host, varname, value):
+        self._variable_manager.set_host_variable(host, varname, value)
+
+    def set_nonpersistent_facts(self, host, facts):
+        self._variable_manager.set_nonpersistent_facts(host, facts)
+
+    def set_host_facts(self, host, facts):
+        self._variable_manager.set_host_facts(host, facts)
+
     def __contains__(self, host_name):
         return self._find_host(host_name) is not None

Basically, we no longer try to copy the whole set of variables each time a small modification is made; we simply update both variable managers.

With this new patch, Ansible takes about 10 seconds, versus 4 minutes without it.

@jimi-c what do you think?

jimi-c (Member) commented Nov 28, 2015

@Yannig I think that's a good solution, send a PR.

Yannig added a commit to Yannig/yannig-ansible-playbooks that referenced this issue Nov 28, 2015
Yannig (Contributor) commented Nov 29, 2015

@jimi-c here is the PR #13307

You can also retrieve an example playbook at the following location: https://github.com/Yannig/ansible-issue-low-speed/blob/master/big-vars-files/site_provision.yml

To launch it, you need the compute group with hosts within it. Here's an example:

[compute]
compute-[0:19]

And finally, run it with the following variables:

ansible-playbook site_provision.yml -e game_env=arenamulti -e aws_env=cashdev

jimi-c closed this as completed in 5227c6b Nov 30, 2015
jimi-c pushed a commit that referenced this issue Nov 30, 2015
zentavr (Author) commented Nov 30, 2015

Everything works except this:

TASK [arenaapp : include] ******************************************************
task path: /Users/zentavr/__repos/ansible/roles/arenaapp/tasks/main.yml:4
Unexpected Exception: UUID('22a719b9-3927-473d-943f-2f186489a0f5') is not JSON serializable
the full traceback was:

Traceback (most recent call last):
  File "/Users/zentavr/__repos/ansible/.venv/bin/ansible-playbook", line 84, in <module>
    sys.exit(cli.run())
  File "/Users/zentavr/__repos/ansible/.venv/lib/python2.7/site-packages/ansible/cli/playbook.py", line 149, in run
    results = pbex.run()
  File "/Users/zentavr/__repos/ansible/.venv/lib/python2.7/site-packages/ansible/executor/playbook_executor.py", line 147, in run
    result = self._tqm.run(play=play)
  File "/Users/zentavr/__repos/ansible/.venv/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py", line 250, in run
    play_return = strategy.run(iterator, play_context)
  File "/Users/zentavr/__repos/ansible/.venv/lib/python2.7/site-packages/ansible/plugins/strategy/linear.py", line 240, in run
    self._queue_task(host, task, task_vars, play_context)
  File "/Users/zentavr/__repos/ansible/.venv/lib/python2.7/site-packages/ansible/plugins/strategy/__init__.py", line 160, in _queue_task
    zip_vars = zlib.compress(json.dumps(task_vars), C.DEFAULT_VAR_COMPRESSION_LEVEL)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: UUID('22a719b9-3927-473d-943f-2f186489a0f5') is not JSON serializable

The problem went away after I commented out the setting in my ansible.cfg.
The variable was var_compression_level=9 in the [defaults] section.
I'm running Ansible at commit f926e81.
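
For context, json.dumps raises this TypeError for any object it has no built-in encoding for, such as the uuid.UUID stored in role_uuid. A minimal sketch of the usual workaround, a "default" hook that stringifies UUIDs (illustrative only, not necessarily the actual fix that landed in c6a30f7):

import json
import uuid
import zlib

def to_serializable(obj):
    # Fall back to the string form for types json cannot encode natively,
    # e.g. the uuid.UUID stored under role_uuid.
    if isinstance(obj, uuid.UUID):
        return str(obj)
    raise TypeError("%r is not JSON serializable" % obj)

# Mimics the failing call in strategy/__init__.py:_queue_task.
task_vars = {"role_uuid": uuid.uuid4(), "game_env": "arenamulti"}
zip_vars = zlib.compress(json.dumps(task_vars, default=to_serializable).encode("utf-8"), 9)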

zentavr (Author) commented Nov 30, 2015

FYI @jimi-c and @Yannig

zentavr (Author) commented Nov 30, 2015

The play was (line 4 is that include):

---
# This will set up ELBs before anything else will be done.
# ... this play is going to be executed only on a single node which is localhost
- include: prepare_elbs.yml
  tags: [ 'arena_elbs' ]
  when: prepare_elbs

jimi-c (Member) commented Nov 30, 2015

Whoops, that's probably my fault. I added the role_uuid variable and hadn't tested it with compression enabled, so it should be an easy fix.

jimi-c (Member) commented Nov 30, 2015

@zentavr fixed that bug via c6a30f7. Let us know (in a new issue) if you're still seeing any problems.

Thanks!

zentavr (Author) commented Nov 30, 2015

@jimi-c - works great for me now with var_compression_level defined in the settings file. Thanks!

ansibot added the bug label and removed the bug_report label Mar 7, 2018
ansible locked and limited conversation to collaborators Apr 25, 2019