New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP Performance Work
base: devel
Are you sure you want to change the base?
WIP Performance Work #64961
Conversation
de82b46
to
d2defe0
Compare
Rather than send task_vars down to worker (massive single-threaded O(config) generation+serialization bottleneck, and in this case filesystem IO -- good luck with NFS), send template-expanded values up from worker to parent (fully parallelized generation and basically free, aside from fiddly code). Take care not to block task scheduling loop while waiting for processed values |
@dw the problem is we need the task_vars in the main proc for a few templated fields (the task name and now the throttle value), so we have to generate them there. I did test just throwing that away and having the worker regenerate them, however that was actually quite a bit slower. Actually just typing that I realize now that since I've moved callbacks to the result side we don't need to template the task name but the throttle field stands :/ |
Given throttle is inherently 1 for each host-task combination (Ansible runs no more than one task per host simultaneously), does it make sense for its template to have host-task binding? Even if host binding is desired, IIRC there is precedent elsewhere to template using only the first matched host, although this might be something from an ancient version. For 1000 hosts, eliminating 999 get_vars() per task would be a big deal, especially with a slow vars configuration. The binding of throttle isn't documented, only that it can be set on a per-task basis, so making it play+task binding and calculating it once might be something that could live in the changelog, if mentioned at all. There is also the option (as for run_once) of verifying throttle is a template before calling get_vars(). For callbacks I use a buffer where events are reordered to ensure on_task_start for the first host is always reported before subsequent callbacks, by delaying them until the worker has reported the template-expanded name. That way plugins always see the 'correct' event ordering Regardless of strategy the important thing is that dreaded get_vars() call, there is no escaping its blast radius |
@dw I will give that a try, though it's a little more complicated for non-linear strategies (though we can cache the task_vars and look them up per uuid or something). I had originally wanted to move the get_vars call to the worker side, but earlier in my testing it was slower than sending them over via file. I'll try it again though as only generating vars once for the (play, task) context instead of per-host may help. |
Also, with static worker sets, we'd have to ship facts/other vars includes to each worker, which may eat up some of the gain. As is, keeping the call to get_vars in the main proc simplifies things. |
I have a replacement vars manager that solves your facts problem and quite a bit more.. it's something that should be upstream. My experiences with fielding proposals and PRs has been abysmal to this point, primarily because most of your communication does not happen in open spaces. Is there some other way I could approach contributing this to you? |
We don't want to merge this code, but disable it until we know ansible/ansible#64961 is working properly. Signed-off-by: Paul Belanger <pabelanger@redhat.com>
recheck |
0f7fcf9
to
065b348
Compare
0f2187f
to
530fe3b
Compare
ci_complete
One of the performance improvements from perf_idea was to make the DataLoader class init with the basedir unfracked. This prevents a LOT of os.path calls later, but it does mean the DataLoader will never deal with relative paths in methods like the dwim searches. Overall this should be better as it removes a potential bug with relative pathing, but it does have implications for testing. These tests were disabled because they're probably not required anymore. They could be shifted to test against the $CWD of the tests but we may not care.
* Temporarily disabled direct use of send_callback as there are serialization issues with several of the objects being sent across the Queues
* args was not set to always post validate?
* Fixing botmeta since we replaced the old plugin loader * Fixing role duplication detection and handling * Cleaning up pep and other sanity issues
This also cleans up some other code, which was used to set the preferred extension type based on the connection plugin.
* Modifies include_vars to use new role data fields
* Fixing templating of loop control fields * Feeding include_vars info back to workers via updates
@jimi-c This PR was evaluated as a potentially problematic PR for the following reasons:
Such PR can only be merged by human. Contact a Core team member to review this PR on IRC: |
This is an experimental branch for testing performance related improvements.
Depends-On: ansible/ansible-zuul-jobs#245