fix(inputs.infiniband): Handle devices without counters #14049
Thanks so much for the pull request!
Thank you for the PR!
Looking at the issue, it seems that some devices report as InfiniBand capable but have no counters, and right now that prevents any metrics from being collected. Continuing past them lets the devices that can be collected correctly still report metrics.
Does that sound right?
!signed-cla
Almost :)
original behavior (metric collection fails on device
after suggested patch (metric collection skips
Thanks for the explanation.
It seems this function only returns an error when the file does not exist or cannot be read, so I think it makes sense to return only what we can read.
Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip. 👍 This pull request doesn't change the Telegraf binary size 📦
Thanks for the fix @jose-d!
(cherry picked from commit 360eeec)
resolves #8135
Not all RDMA devices have counters parsable by `GetRdmaSysfsStats`. This PR adds a `continue` statement to ignore such devices (and continue with the next device/port) instead of throwing an error and failing
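
The skip-and-continue behavior described above can be illustrated with a minimal Go sketch. This is not the plugin's actual code: `statsFunc`, `port`, `collect`, `fakeRead`, and the device names are hypothetical, and the reader is passed in as a parameter so the example runs without the real RDMA library or hardware.

```go
package main

import (
	"errors"
	"fmt"
)

// statsFunc is a hypothetical stand-in for a per-port counter reader such as
// GetRdmaSysfsStats; the signature used here is an assumption for illustration.
type statsFunc func(device string, port int) (map[string]uint64, error)

// port identifies one device/port pair to be polled (hypothetical type).
type port struct {
	device string
	number int
}

// collect reads counters for every port and skips ports whose counters
// cannot be read, instead of aborting the whole collection.
func collect(ports []port, read statsFunc) map[string]map[string]uint64 {
	out := make(map[string]map[string]uint64)
	for _, p := range ports {
		stats, err := read(p.device, p.number)
		if err != nil {
			// No readable counters for this port: move on to the next one.
			continue
		}
		out[fmt.Sprintf("%s/%d", p.device, p.number)] = stats
	}
	return out
}

// fakeRead simulates one device that has counters and one that does not.
func fakeRead(device string, _ int) (map[string]uint64, error) {
	if device == "dev_without_counters" {
		return nil, errors.New("counters not found")
	}
	return map[string]uint64{"port_rcv_data": 42}, nil
}

func main() {
	ports := []port{{"dev_without_counters", 1}, {"dev_with_counters", 1}}
	fmt.Println(collect(ports, fakeRead))
}
```

With an early `return err` in place of `continue`, the first device without counters would stop collection for every device after it, which is the failure mode this PR addresses.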