
fact gathering optimization for scalability #2553

Closed
bengland2 opened this issue May 2, 2018 · 4 comments

@bengland2 Contributor
The `gather and delegate facts` task is not scalable: it requires O(N^2) fact-gathering operations in total, because every host gathers facts about every other host, and it is not clear to me why that should be necessary. I have run tests with the patch below, which gathers facts on each host ONCE instead of N times, and ceph-ansible succeeds just the same. The effect on memory consumption and on the time to execute this task is dramatic. Particularly for HCI configurations, this could make a big difference in the scalability of ceph-ansible.

--- /usr/share/ceph-ansible/site-docker.yml.sample.v31_orig	2018-04-30 19:53:30.539611319 +0000
+++ /usr/share/ceph-ansible/site-docker.yml.sample	2018-04-30 22:05:00.561611319 +0000
@@ -24,26 +24,16 @@
     - name: gather facts
       setup:
       when:
-        - not delegate_facts_host | bool or inventory_hostname in groups.get('clients', [])
+        - not delegate_facts_host | bool
 
     - name: gather and delegate facts
       setup:
-      delegate_to: "{{ item }}"
-      delegate_facts: True
-      with_items: "{{ groups['all'] | difference(groups.get('clients', [])) }}"
       when:
         - delegate_facts_host | bool

Here's a graph of ansible RSS and virtual memory consumption before the patch, with --forks 25, 3 [osds] hosts, 3 [mons] hosts, and 80 [clients] hosts, running ansible 2.4.3 and ceph-ansible-3.1.0-0.1.beta8 starting at 18:35:
[graph: cephans31-fork25]

And here's the same exact run with the patch, starting at 19:56:

[graph: cephans31-factpatch-fork25]

Elapsed time for this one fact-gathering task drops from 5 minutes to about 1 minute, and RSS drops from 8 GB to about 2 GB.

@bengland2 Contributor Author

The above patch should not work in general, as @guits explained to me, because tasks running against a remote host A may reference vars (facts) about another host B, so facts about B have to be gathered and made available before those variables can be used on A (his example below).

However, it may be possible to cut down on the (A, B) combinations that have to be fact-gathered, lowering the work done in this task from O(N^2) to O(N), where N is the number of hosts in the ansible inventory file.

  • RADOS clients and OSDs find out about other OSDs, storage pools, etc. by talking to the monitors, not through ceph-ansible. So in theory it should be sufficient for each non-monitor host to gather facts about itself and about the 3 monitor hosts (so that ceph.conf can be constructed).
  • We don't need all facts about the monitor hosts, probably just a subset including network information (see the sketch after this list).
  • Monitors have to know about all other hosts so that they can construct crush maps, build MDS filesystems, etc. But there are only 3 of these hosts, so it's not so bad.
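
As a rough sketch of the fact-subset idea: the setup module accepts a gather_subset parameter that limits which facts are collected. Whether the 'network' subset covers everything the templates need is an assumption here, not something tested:

    # hedged sketch: gather only the network fact subset about the monitors;
    # 'network' being sufficient for ceph.conf is an assumption, not verified
    # against everything the templates reference
    - name: everyone gathers network facts about monitors
      setup:
        gather_subset:
          - network
      delegate_to: "{{ item }}"
      delegate_facts: true
      with_items: "{{ groups['mons'] }}"
      when:
        - inventory_hostname not in groups.get('mons', [])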

This might not be that hard to verify -- i.e., search the roles for references to the Ansible "groups" var. A preliminary search seemed to confirm this.

So the resulting patch might look something like this (only limited testing so far):

    # fact gathering delegation allows modules on remote hosts
    # to access information about other hosts
    
    # this step allows monitors to know cluster configuration
    
    - name: monitors gather facts about everyone
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups['all'] }}"
      when:
        - inventory_hostname in groups.get('mons', [])

    # this step allows any host to construct a ceph.conf, for example
    
    - name: everyone gathers facts about monitors
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups['mons'] }}"
      when:
        - inventory_hostname not in groups.get('mons', [])

I am testing this on a 3-node cluster with a single monitor and all other roles defined on the other 2 nodes, using site.yml, and I verified that no fact gathering was done between pairs of non-monitor hosts, therefore it is now O(N) computational complexity.
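
To make the count explicit: with M monitor hosts and N hosts total, the two tasks above perform about M*N (monitors about everyone) plus (N-M)*M (everyone about monitors) delegated setup calls, i.e. 2MN - M^2, which is linear in N for fixed M (typically 3), versus roughly N^2 for the original all-pairs scheme.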

Background: Guillaume Abrioux provided this example illustrating why ceph-ansible requires remote hosts to have facts about other hosts on hand:

    # site.yml
    ---
    - hosts:
        - osds
      gather_facts: true

      roles:
        - role: defaults
        - role: osd

    # roles/osd/tasks/main.yml
    - name: test
      debug:
        msg: "{{ hostvars['mon0']['ansible_all_ipv4_addresses'] }}"

results in output like this:

TASK [osd : test] **************************************************************************************************************************************************************************************************************************************************************
fatal: [osd0]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'ansible_all_ipv4_addresses'\n\nThe error appears to have been in '/Users/guits/GIT/playbook-test/roles/osd/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: test\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'ansible_all_ipv4_addresses'"}
fatal: [osd1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'ansible_all_ipv4_addresses'\n\nThe error appears to have been in '/Users/guits/GIT/playbook-test/roles/osd/tasks/main.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: test\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'ansible_all_ipv4_addresses'"}
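
For contrast, here is a minimal sketch of how delegated fact gathering makes that reference resolve, using the same mon0/osds inventory as the example above:

    # site.yml (sketch): populate hostvars['mon0'] before referencing it
    - hosts: osds
      gather_facts: true
      tasks:
        # gather mon0's facts once and store them under hostvars['mon0'],
        # so tasks running against the osd hosts can reference them
        - name: gather facts about mon0
          setup:
          delegate_to: mon0
          delegate_facts: true
          run_once: true

        - name: test
          debug:
            msg: "{{ hostvars['mon0']['ansible_all_ipv4_addresses'] }}"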

guits added a commit that referenced this issue May 3, 2018
there is no need to gather facts in an O(N^2) way; only one node needs to gather facts from the other nodes.

Fixes: #2553

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
@bengland2
Copy link
Contributor Author

The method in the 2nd comment seems to work. I do not totally understand why it works yet, particularly with respect to ceph.conf.j2, which has all sorts of references to vars in other groups besides 'mons'. Guillaume has proposed a much smaller patch that may accomplish the same thing -- what does the "run_once" addition actually do?

@guits Collaborator

guits commented May 4, 2018

@bengland2 I think I was a bit confused about the fact-gathering topic, but after some tests it appears that as soon as you've gathered facts once for a node, they are available from any other node.

run_once: true makes the task run only once.
It's just nicer than having something like when: inventory_hostname == groups['all'] | first, which would produce a lot of 'skipping' lines in the log, and I think run_once: true also behaves better when using --limit.
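
For reference, that patch would presumably look something like this (a sketch inferred from this discussion, not necessarily the exact merged change):

    - name: gather and delegate facts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: true
      with_items: "{{ groups['all'] }}"
      run_once: true
      when:
        - delegate_facts_host | bool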

leseb pushed a commit that referenced this issue May 4, 2018
there is no need to gather facts in an O(N^2) way; only one node needs to gather facts from the other nodes.

Fixes: #2553

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 75733da)
Signed-off-by: Sébastien Han <seb@redhat.com>
@bengland2 Contributor Author

@guits @jtaleric @fultonj I tried it with 80 computes, 4 OSDs, and 3 mons using ceph-ansible-3.1.0-0.1.rc3.el7cp.noarch and ansible 2.4.3. Memory consumption for this task dropped to near zero, very cool. But it still takes 10 minutes to get through this one fact-gathering task, not as cool. I think it's doing 1 host at a time, presumably because the with_items loop issues its delegated setup calls serially within the single run_once task. I'm leaving the issue closed because the memory problem and the O(N^2) behavior are fixed; maybe in a future release I'll take another look at the speed aspect. At least we now have a test methodology.

guits added a commit that referenced this issue May 24, 2018
Since we fixed the `gather and delegate facts` task, this exception is
not needed anymore. It's a leftover that should be removed to save some
time when deploying a cluster with a large number of clients.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
guits added a commit that referenced this issue Jun 5, 2018
this is kind of a follow-up to what was done in #2560.
See #2560 and #2553 for details.

Closes: #2708

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>