-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hostmetricsreceiver: use gopsutil cached boot time #32126
hostmetricsreceiver: use gopsutil cached boot time #32126
Conversation
This PR enables `gopsutil`'s cached boot time feature. The primary purpose is to reduce the CPU usage of the `process` and `processes` scrapers, which read the boot time vastly more times than necessary. Also added the `hostmetrics.process.bootTimeCache` featuregate which is enabled by default. Disabling it will return the `process` scraper to a similar functionality of reading the boot time at the start of every scrape (but still won't read it as much as it used to).
Working on changelog and documentation, wanted to get it started running tests early. |
receiver/hostmetricsreceiver/internal/scraper/processscraper/process_scraper.go
Show resolved
Hide resolved
host.EnableBootTimeCache(false) | ||
_, err := host.BootTimeWithContext(ctx) | ||
if err != nil { | ||
errs.AddPartial(1, fmt.Errorf(`retrieving boot time failed with error "%w", using cached boot time`, err)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm till not sure why we need it. I'd prefer to keep the Alpha state as it is right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean keeping the alpha state of the featuregate as the old behaviour of not caching anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I don't want to do that is because I want to make it so that with the featuregate on or off, the performance is still down to a more normal level. The code here ends up with what is essentially the same setup as before, but reading the boot time once as opposed to (number of processes) * 4. This has basically the same performance benefit as caching the boot time for the whole run of the process, but maintains functionally the same old behaviour. I mentioned in the comment on the issue that I was going to have the caching turned on by default, and when the featuregate was disabled it would do this. Should I invert it so the featuregate is alpha and we use this behaviour instead by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that one feature gate controls two new behaviors even if one of them doesn't affect the user experience. But I don't think it's worth debating too much. So we can merge it in this state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please leave a note here and in docs what is needed as a user by cfg or as a developer by "ocb build" to achieve the minimum cpu load by max caching, because i already feel a bit lost how will be able to make use of it based on this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cforce Nothing will be needed in config or build to get the maximum caching. Caching the boot time for the entire lifetime of the collector will be the new default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to build prerelase of 0.99 with ocb , which XXXX version/tag/branch name shall i use?
- gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver XXXX
is it ?
- gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver@main
version: latest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I need to get a commit-specific version, I go to the component I care about and get the latest commit hash, then run go list -m <module>@<commit>
which gives me the module pseudo-version.
$ go list -m github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver@fcf1bee
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.98.1-0.20240416201607-fcf1bee48590
So based on that in ocb you'd do:
- gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.98.1-0.20240416201607-fcf1bee48590
I've never tried mixing versions between components in an ocb config; you might need to use that version for everything if I'm not mistaken.
now released in otecol 0.99.0 👍 - excellent |
) **Description:** This PR enables `gopsutil`'s cached boot time feature. The primary purpose is to reduce the CPU usage of the `process` and `processes` scrapers, which read the boot time vastly more times than necessary. Also added the `hostmetrics.process.bootTimeCache` featuregate which is enabled by default. Disabling it will return the `process` scraper to a similar functionality of reading the boot time at the start of every scrape (but still won't read it as much as it used to). **Link to tracking Issue:** open-telemetry#28849 **Testing:** No tests were added. Ran the collector to ensure that everything still functioned as expected with the new functionality.
Description:
This PR enables
gopsutil
's cached boot time feature. The primary purpose is to reduce the CPU usage of theprocess
andprocesses
scrapers, which read the boot time vastly more times than necessary.Also added the
hostmetrics.process.bootTimeCache
featuregate which is enabled by default. Disabling it will return theprocess
scraper to a similar functionality of reading the boot time at the start of every scrape (but still won't read it as much as it used to).Link to tracking Issue:
#28849
Testing:
No tests were added. Ran the collector to ensure that everything still functioned as expected with the new functionality.
Documentation:
TODO