Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not delegate facts on client nodes #2459

Merged
merged 1 commit into from Apr 4, 2018
Merged

Conversation

guits
Copy link
Collaborator

@guits guits commented Mar 21, 2018

This commit is a workaround for
https://bugzilla.redhat.com/show_bug.cgi?id=1550977

We iterate over all nodes on each node and we delegate the facts gathering.
This is high memory consuming when having a large number of nodes in the
inventory.
That way of gathering is not necessary for clients node so we can simply
gather local facts for these nodes.

Signed-off-by: Guillaume Abrioux gabrioux@redhat.com

@guits
Copy link
Collaborator Author

guits commented Mar 21, 2018

jenkins test luminous-ansible2.4-purge_bluestore_osds_container

@leseb
Copy link
Member

leseb commented Mar 22, 2018

I'm ok with the patch but last time I discussed this with @fultonj he said disabling fact gathering didn't change anything. What has changed since?

@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

@leseb this is what I've raised yesterday in a discussion with @andrewschoen
The thing is that we are not sure how they exactly tested. The context is still unclear.
I'm waiting for an env with that reproduced issue so I can test this patch myself.

@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

jenkins test luminous-ansible2.4-purge_bluestore_osds_container

@guits guits changed the title do not delegate facts on client nodes [dnm] do not delegate facts on client nodes Mar 22, 2018
@@ -32,7 +32,7 @@
delegate_facts: True
with_items: "{{ groups['all'] }}"
when:
- delegate_facts_host | bool
- delegate_facts_host | bool or inventory_hostname not in groups.get('clients')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 things:

  1. The logic seems wrong:
    This will delegate facts if delegate_facts_host is true or the host isnt in the clients group. E,g, if you set "delegate_facts_host = false" it will still delegate on non-client hosts.
    The title makes it sound like you want delegate_facts_host and inventory_hostname not in groups.get('clients') - which should delegate if delegate_facts_host is true and the host isn't in the client group.

  2. This has a consequence because if you have a host that is in the clients group and another group (for whatever reason) it won't delegate facts. You may want to check if it is in all the other groups, but not in the clients group, but that makes a long statement.

Maybe a better way would be to remove the "clients" from the hosts: for this task (line 13), then the gather facts task wont run against clients though, you could run that first against clients.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's correct, i'll update this commit.
thanks @andymcc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of removing clients from the hosts line for this play. Perhaps we should make this fact delegation it's own play entirely, not add clients there but then keep them here so that ceph-defaults and ceph-docker-common can be run against them?

I think that would take care of the second concern @andymcc had.

@guits guits force-pushed the skip_gather_facts_for_clients branch 2 times, most recently from 041c123 to 89f543c Compare March 22, 2018 15:07
@guits guits changed the title [dnm] do not delegate facts on client nodes [skip ci][dnm] do not delegate facts on client nodes Mar 22, 2018
@guits guits force-pushed the skip_gather_facts_for_clients branch 2 times, most recently from 89f543c to d5214c3 Compare March 22, 2018 16:27
@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

what about using with_items: "{{ groups['all'] | difference(groups['clients']) }}" ?

here is the result.
We can see client0 isn't gathering any facts and nobody is gathering facts about client0
later client0 will gather his facts here

@andymcc @leseb @andrewschoen

@andrewschoen
Copy link
Contributor

what about using with_items: "{{ groups['all'] | difference(groups['clients']) }}" ?

I think this is getting closer but I have a couple questions/concerns:

  • if clients is not included in the play that delegates facts and a user provides--limit clients the playbook will fail because the client nodes will not know about the rest of the cluster to create a ceph.conf

  • I think the concern about a node being a client and for example a mon might still exist here. If the hostname for a node exists in both client and mon groups then it would not be included in groups['all'] | difference(groups['clients']) right? This means it's facts would not be delegated and it would not be included in the ceph.conf if --limit clients was used. I'm not sure of a good way to fix this problem without big changes to how we create configuration files though.

This does certainly avoid the issue of delegating facts from clients to all other nodes in the cluster though.

@guits guits force-pushed the skip_gather_facts_for_clients branch 3 times, most recently from 4eaf16f to 72372c6 Compare March 22, 2018 17:49
This commit is a workaround for
https://bugzilla.redhat.com/show_bug.cgi?id=1550977

We iterate over all nodes on each node and we delegate the facts gathering.
This is high memory consuming when having a large number of nodes in the
inventory.
That way of gathering is not necessary for clients node so we can simply
gather local facts for these nodes.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
@guits guits force-pushed the skip_gather_facts_for_clients branch from 72372c6 to 339bab9 Compare March 22, 2018 17:53
@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

jenkins test luminous-ansible2.4-purge_bluestore_osds_container

1 similar comment
@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

jenkins test luminous-ansible2.4-purge_bluestore_osds_container

@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

if clients is not included in the play that delegates facts and a user provides--limit clients the playbook will fail because the client nodes will not know about the rest of the cluster to create a ceph.conf

as discussed with you on IRC (btw, thanks for your help), this has been addressed by the last commit I've pushed.

I think the concern about a node being a client and for example a mon might still exist here. If the hostname for a node exists in both client and mon groups then it would not be included in groups['all'] | difference(groups['clients']) right? This means it's facts would not be delegated and it would not be included in the ceph.conf if --limit clients was used. I'm not sure of a good way to fix this problem without big changes to how we create configuration files though.

yes, this concern is still present and yes, I'm not sure we can find a fix quickly and without bringing a huge change.

@guits
Copy link
Collaborator Author

guits commented Mar 22, 2018

jenkins test luminous-ansible2.4-purge_bluestore_osds_container

@fultonj
Copy link
Contributor

fultonj commented Mar 23, 2018

Preliminary testing shows that this patch plus moving from ansible 2.4.2 to 2.4.3 seems to reduce the memory consumed by ceph-ansible. There have only been 3 test runs (2.4.2, 2.4.3 and 2.4.3 + this patch) so we should probably do more runs before we conclude it reduces memory by x%.

@guits guits changed the title [skip ci][dnm] do not delegate facts on client nodes do not delegate facts on client nodes Mar 27, 2018
@leseb leseb merged commit 5b73be2 into master Apr 4, 2018
@leseb leseb deleted the skip_gather_facts_for_clients branch April 4, 2018 07:23
@leseb leseb mentioned this pull request Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants