Perfy McPerferton #58400
Conversation
bot_skip
Waiting on the results of another perf run to validate some changes.
Perf comparison between this PR and devel:
This is 35k hosts, 10 ping tasks, 25 forks. With a single ping task:
…s in the list comprehension
…esh in get_hosts_remaining and get_failed_hosts for backwards compat
Leaving this in WIP right now, but it's ready for review.
LGTM in general, just those couple of little things. Yay performance!
    if pattern_hash not in self._hosts_patterns_cache:
        patterns = split_host_pattern(pattern)
-       hosts = self._evaluate_patterns(patterns)
+       hosts[:] = self._evaluate_patterns(patterns)
I'm assuming there's some empirically-derived performance benefit to the list replacement instead of reassignment, but it's not obvious to me what it is... Probably deserves a comment over this section if that's the case so any future changes propagate the pattern.
It's one of those things that I have brought along with me over the years. It's ever so slightly faster, and produces at minimum one less malloc. Honestly, my original commit for this said: negligible perf improvement
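The practical difference is easy to show outside Ansible. This is a minimal, standalone demonstration (not the PR's code) of why slice assignment matters when another reference to the list exists, as it does with a cached list:

```python
def rebind(lst):
    # Rebinds the local name to a brand-new list object; anything
    # else holding a reference to the original list never sees this.
    lst = ["new"]
    return lst


def replace_in_place(lst):
    # Slice assignment mutates the existing list object in place:
    # one less list allocation, and every holder of a reference
    # (e.g. a cache entry) sees the refreshed contents.
    lst[:] = ["new"]
    return lst


shared = ["old"]
view = shared              # a second reference, like a cache entry
rebind(shared)
print(view)                # still ['old'] -- the rebind was invisible
replace_in_place(shared)
print(view)                # ['new'] -- the cache sees the update
```

So beyond the minor allocation savings, `hosts[:] = ...` keeps any cached reference to the list consistent, which is relevant once the pattern cache above holds the same list object.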
@@ -1007,6 +1019,7 @@ def _evaluate_conditional(h):
        if task.when:
            self._cond_not_supported_warn(meta_action)
        self._inventory.refresh_inventory()
+       self._set_hosts_cache(iterator._play)
This one's probably also suspect, though in this case I think we want to remove hosts but not add new ones. It really depends on how we expect plays to behave when hosts appear/disappear mid-play, which doesn't seem to be perfectly defined.
I think this one is correct. inventory.refresh_inventory() fully clears the inventory and re-populates it. A call to get_hosts would then have re-populated based on the new state of the inventory, which would have both added new hosts from inventory sources and removed hosts.
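The refresh-then-recache flow being discussed can be sketched as follows. This is an illustrative toy, not the real Ansible implementation; FakeInventory, StrategySketch, and handle_refresh_inventory are hypothetical names standing in for the real classes:

```python
class FakeInventory:
    # Toy inventory whose refresh both adds and removes hosts.
    def __init__(self):
        self._hosts = ["alpha", "beta"]

    def refresh_inventory(self):
        # Like the real method: fully clears and re-populates.
        self._hosts = ["alpha", "gamma"]   # beta removed, gamma added

    def get_hosts(self, pattern):
        return list(self._hosts)


class StrategySketch:
    # Hypothetical mirror of the diff above: after an inventory
    # refresh, the strategy's host cache must be rebuilt too.
    def __init__(self, inventory, play_pattern):
        self._inventory = inventory
        self._play_pattern = play_pattern
        self._hosts_cache = self._inventory.get_hosts(play_pattern)

    def _set_hosts_cache(self, pattern):
        # Re-derive the cached host list from the inventory's new
        # state: new hosts appear, vanished hosts drop out.
        self._hosts_cache = self._inventory.get_hosts(pattern)

    def handle_refresh_inventory(self):
        self._inventory.refresh_inventory()
        self._set_hosts_cache(self._play_pattern)
```

The point of the added `_set_hosts_cache` call in the diff is exactly this: once results are cached for the strategy's lifetime, a full inventory refresh silently invalidates the cache unless it is rebuilt at the same spot where get_hosts used to be re-run.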
It might be correct in that it matches what was happening before, but there seemed to be some disagreement about what exactly should be in the ansible_play_hosts vars when the inventory changes mid-play. Preserving the pre-Perfy behavior is probably best for now, but it's a decision we should make explicitly (and if we were designing it all from scratch, I'd argue that the vars should reflect what's actually happening in the current play, which they sometimes don't today). I tripped over several "unintuitive" dynamic inventory behaviors while testing this stuff, most of which existed previously. We don't need to correct them here, but I also don't want to forget about them, so if we're not going to fix them, I'll open separate issues for them.
Due to the potential of unintended consequences of this PR already, I'd like to limit behavior changes here if possible.
@@ -141,7 +141,8 @@ def extra_vars(self):
    def set_inventory(self, inventory):
        self._inventory = inventory

-   def get_vars(self, play=None, host=None, task=None, include_hostvars=True, include_delegate_to=True, use_cache=True):
+   def get_vars(self, play=None, host=None, task=None, include_hostvars=True, include_delegate_to=True, use_cache=True,
There was also some discussion about whether we could somehow hang the cache off the play object instead of adding the new optional args everywhere. Left as an exercise for the reader, but the new plumbing for this is the "ickiest" part of this change IMO.
Yeah, I don't really disagree. I'm not necessarily a huge fan of these extra args either, but hanging the cache off the play didn't feel right.
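The trade-off being debated can be sketched side by side. Everything below is hypothetical (names, signatures, and the counter are illustrative stand-ins, not the real Ansible API):

```python
CALLS = {"count": 0}


def expensive_get_hosts(play):
    # Stand-in for the costly inventory walk the PR avoids repeating.
    CALLS["count"] += 1
    return ["h1", "h2"]


def get_vars(play, _hosts=None):
    # Approach taken in the PR: an extra optional argument carrying
    # the pre-computed hosts. Explicit, but every caller in the
    # chain has to plumb it through ("ickiest" part, per above).
    hosts = _hosts if _hosts is not None else expensive_get_hosts(play)
    return {"ansible_play_hosts": hosts}


def get_vars_via_play(play):
    # The alternative discussed: hang the cache off the play object.
    # Less plumbing, but it mutates a shared object and hides the
    # data flow from readers of the call sites.
    if getattr(play, "_hosts_cache", None) is None:
        play._hosts_cache = expensive_get_hosts(play)
    return {"ansible_play_hosts": play._hosts_cache}
```

Either shape avoids recomputing the hosts on every get_vars call; the disagreement above is purely about where the cache lives, not whether to cache.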
With a conscious decision to preserve existing behaviors around dynamic inventory (and those couple of fixes in place to match existing behavior), +1 to merge in the current state.
SUMMARY
This PR aims to further advance performance improvements with inventories consisting of thousands of hosts.
ISSUE TYPE
COMPONENT NAME
many
ADDITIONAL INFORMATION
Most perf testing was done using a set of 35k hosts (localhost aliases), 1-10 ping tasks, and 25 forks.
cProfile provided much of the inspiration for the fixes. Many are micro performance improvements that will only be visible with a huge number of hosts.
A few things this PR does:
- Caches SharedPluginLoaderObj to avoid repeated instantiation
- Instead of calling get_hosts so frequently, caches the results and manipulates the cache for the lifetime of the Strategy plugin
- Cached hosts are passed to get_vars and used by _get_magic_variables, so that we aren't recalculating get_hosts repeatedly in that code path when invoked from the Strategy (can be extended further later if needed)
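The get_hosts caching above (visible in the first diff hunk, which references `_hosts_patterns_cache` and `split_host_pattern`) can be sketched roughly like this. The class and the pattern evaluation are simplified stand-ins, not the real Ansible inventory code:

```python
class InventorySketch:
    # Toy model of a pattern-keyed host cache: evaluate a host
    # pattern once, then serve repeated lookups from the cache.
    def __init__(self, all_hosts):
        self._all_hosts = list(all_hosts)
        self._hosts_patterns_cache = {}

    def _evaluate_patterns(self, patterns):
        # Stand-in for real pattern evaluation: 'all' matches every
        # host, otherwise host names are matched literally.
        if "all" in patterns:
            return list(self._all_hosts)
        return [h for h in self._all_hosts if h in patterns]

    def get_hosts(self, pattern):
        pattern_hash = hash(pattern)            # simplified cache key
        if pattern_hash not in self._hosts_patterns_cache:
            patterns = pattern.split(":")        # simplified split
            hosts = self._evaluate_patterns(patterns)
            self._hosts_patterns_cache[pattern_hash] = hosts
        return self._hosts_patterns_cache[pattern_hash]
```

Because repeat calls return the same cached list object, in-place mutation of that list (the `hosts[:] = ...` idiom from the first diff hunk) updates what every previous caller sees, which is why the slice-assignment detail and the caching work hand in hand.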