-
Notifications
You must be signed in to change notification settings - Fork 2.6k
ethtool: Prevent duplicate metric names #2187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Sanitizing the metric names can lead to duplicate metric names: ``` caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer prometheus#2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values" ``` Generate a map from the sanitized metric names to the metric names from ethtool. In case of duplicate sanitized metric names drop both metrics, because it is unknown which one to take. Fixes: prometheus#2185 Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
00d3c78 to
865db3b
Compare
|
LGTM |
|
Not sure if we should drop them or add an index or something to make them unique. |
|
The driver in question, vmxnet3 (that's the VMware paravirt NIC driver) makes a creative "visual indent" hierarchy of ethtool metrics. I just checked one of my systems that has multi-queue enabled and it's worse than I thought. Not only are there two "giant hdr" entries per queue, but the individual metrics themselves are not uniquely named per queue. This is trying to show a hierarchical relationship, so the various "giant hdr" metrics are really more like We could attempt to reconstruct that hierarchy by counting leading spaces. We could add an index number, but we'd end up with "giant hdr number 1", "giant hdr number 2" for queue 0, and "giant hdr number 3" and "giant hdr number 4" for queue 2? |
|
The unfolding of the indentation should be done in the ethtool library IMO. |
|
Yeah ideally the ethtool lib takes care of parsing this as a hierarchy. I'm quite surprised though that these syscalls are returning indented strings? But until this happens, the best we can do is either drop duplicates or add an index. I think we'll need to do the later since otherwise it's simply impossible to get the otherwise drops metrics. Thoughts? |
|
I prefer dropping it, because adding an index makes the metric name fragile IMO. Is "giant hdr number 2" the giant hdr from Tx Queue number 1 or did the queue change? |
|
I'd need to be a stable order, sure. But this seems better than not being able to monitor "giant hdr number 2" at all, no? |
|
We're going to have to drop the duplicates for now until I can get the ethtool library to do something useful here. The Stats() call returns a As to why the syscall is returning indented strings, they're hardcoded in the driver. |
|
Ah got it, well then this LGTM. |
|
@SuperQ PTAL |
Sanitizing the metric names can lead to duplicate metric names: ``` caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values" ``` Generate a map from the sanitized metric names to the metric names from ethtool. In case of duplicate sanitized metric names drop both metrics, because it is unknown which one to take. Fixes: prometheus#2185 Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Sanitizing the metric names can lead to duplicate metric names: ``` caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values" ``` Generate a map from the sanitized metric names to the metric names from ethtool. In case of duplicate sanitized metric names drop both metrics, because it is unknown which one to take. Fixes: prometheus#2185 Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Sanitizing the metric names can lead to duplicate metric names:
Generate a map from the sanitized metric names to the metric names from ethtool. In case of duplicate sanitized metric names drop both metrics, because it is unknown which one to take.
Fixes: #2185