Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve add_facts performance #3292

Merged
merged 1 commit into from
Apr 16, 2024

Conversation

seanmil
Copy link
Contributor

@seanmil seanmil commented Apr 2, 2024

When making multiple add_facts calls to add facts to an already present and large set of target facts performance declines as more are added due to the initialization and copying of the large hashes each time.

This change merges the added facts in place to the existing facts, which slightly improves performance.

@seanmil seanmil requested a review from a team as a code owner April 2, 2024 16:46
lib/bolt/util.rb Outdated Show resolved Hide resolved
@donoghuc
Copy link
Member

donoghuc commented Apr 2, 2024

seems like a good improvement. Can you update commit message to be formatted with https://github.com/puppetlabs/bolt/blob/main/CONTRIBUTING.md#pull-requests style that will help our docs automation?

I'll need to dig in to the failing tests, it appears to be unrelated to this work.

@seanmil
Copy link
Contributor Author

seanmil commented Apr 2, 2024

I will fix this up and re-push, but just so you know - the original symptom of VERY slow add_facts calls which sent me down this path seem to have actually been due to the Bolt analytics associated collected for that call. In the scenario I was testing I have some 3,300 targets which I'm trying to add 6 specific facts from PuppetDB each to (the whole set of facts is unneeded and would be way too much) - which results in a little more than 20,000 add_facts calls. It seems when analytics is enabled it just falls over at some point when processing them. For a smaller but still large number of calls it just makes the add_facts calls start to take progressively (and significantly) more time to return. When I added analytics: false to the bolt-project.yaml it immediately started working and the performance was quite acceptable.

I was misled in my troubleshooting because it seems that analytics is not enabled when running bolt from a Git checkout instead of from the normal package. So, this change is probably still a good idea, but is quite incremental vs. revolutional as I had initially thought.

@donoghuc
Copy link
Member

donoghuc commented Apr 2, 2024

Thanks for reporting that. I've filed a bug issue and I intend on disabling analytics submissions by default moving forward. #3293

When making multiple add_facts calls to add facts to an already
present and large set of target facts performance declines
as more are added due to the initialization and copying of
the large hashes each time.

This change merges the added facts in place to the existing
facts, which provides a slight performance improvement.

!feature

* **Minor add_facts optimization**

  Deep merge new facts into existing target fact hash, instead
  of creating and copying a new hash each time.
@donoghuc donoghuc merged commit 00ce3bd into puppetlabs:main Apr 16, 2024
92 of 130 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants