Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Use gopsutils for consistent cpu/mem/disk metrics across multiple platforms #233

Closed
wants to merge 6 commits into from

Conversation

discordianfish
Copy link
Member

Hi everyone,

this change introduces a new collector based on https://github.com/shirou/gopsutil. That lib provides platform independent metrics and is used by projects like telegraf etc. With this in place you can monitor a heterogeneous environment and use the same names/dashboards to refer to your metrics, not matter if it's a linux, bsd, osx or windows system.

This change shouldn't break existing systems since it's not removing the existing metrics unless the resulting metric is exactly the same. In a upcoming PR I would remove the redundant metrics from the old collectors.

Before merging this, I'll validate some assumptions and make sure that the exposed metrics are correct.

@brian-brazil
Copy link
Contributor

I think that while unifying these would be nice, it is ultimately a fool's errand as the semantics are different on different OSes. Each has their own meanings for things like cpu time and memory usage - and even if they were the same we don't get to remove any significant amount of code from existing collectors as this only covers a small subset of their functionality.

@discordianfish
Copy link
Member Author

@brian-brazil Yes, there are always specifics to the different OSs, but what gopsutils provide is a good basic set which helps ppl to get started. I really need osx metrics and what gopsutils provide in that regard is a good start. We'll still have a lot of os specific metrics/collectors, but I don't think this should stop us from consistent metrics where possible.

@discordianfish discordianfish force-pushed the fish/use-gopsutils branch 2 times, most recently from 91e9dd8 to 41d1a5c Compare April 28, 2016 21:29
@brian-brazil
Copy link
Contributor

If you want OS X statistics, then propose a change to add OS X statistics.
This change has a much broader scope, and implies at the least breaking changes to memory, swap and network metrics - all of which are core metrics that are widely depended on. I presume that there are further breaking changes on non-Linux OSes.

Gopsutil may have been a good starting point if we were doing the node exporter from scratch, but we're now well beyond the point where delegating this to a 3rd party library would make sense.

@discordianfish
Copy link
Member Author

@brian-brazil It has a broader scope indeed, but I don't understand what's wrong with that.
We should do some breaking changes because the metric names are not following our best practices. For that we have also #150 open. The arguments over there are about how to make things consistent without taking care of some mapping. This is what gopsutils does for us.
For metrics that are already named correctly, nothing will change. Finally we can have the metrics with the old name in parallel for some transition period.

@brian-brazil
Copy link
Contributor

We should do some breaking changes because the metric names are not following our best practices. The arguments over there are about how to make things consistent without taking care of some mapping. This is what gopsutils does for us.

This is a separate argument, unrelated to gopsutil. If you wish to improve the metrics names you should send a PR doing only that, as this PR does not move us in this direction.

You need to justify gopsutil on it's own merits. You're proposing 300+ files worth of abstraction that make it harder to figure out where we're getting metrics from that only save us a few lines of code, and also increases complexity as we need to figure out for each metric whether it should come direct or via gopsutil - and exclude the ones that do individually. For example in netdev as gopsutil seems to lack full support, we'd need a blacklist to avoid duplication.

This is far from a clear win, and to me seems like a net loss for understandability and maintainability.

@discordianfish
Copy link
Member Author

We should do some breaking changes because the metric names are not following our best practices. The arguments over there are about how to make things consistent without taking care of some mapping. This is what gopsutils does for us.

This is a separate argument, unrelated to gopsutil. If you wish to improve the metrics names you should send a PR doing only that, as this PR does not move us in this direction.
It does move us into that direction. All it's metrics follow our best practices (which is also why they don't collide with existing metrics).

You need to justify gopsutil on it's own merits. You're proposing 300+ files worth of abstraction that make it harder to figure out where we're getting metrics from that only save us a few lines of code, and also increases complexity as we need to figure out for each metric whether it should come direct or via gopsutil - and exclude the ones that do individually.

It saves us lots of code if we want to support windows and osx metrics. If we don't, sure - the complexity isn't justified. What is your thought on this in general? I always wanted it to support metrics across platforms.

For example in netdev as gopsutil seems to lack full support, we'd need a blacklist to avoid duplication.

I would try to separate the generic gopsutil collector which provides basic metrics across all platforms, from OS specific collectors. For somethings though we can also just contribute to gopsutil to extend it.

This is far from a clear win, and to me seems like a net loss for understandability and maintainability.

I guess it really depends on the scope. Alternatively we can also duplicate (read: copy&paste) relevant code from gopsutil and include it in the node-exporter. But I'm not sure this makes more sense.

@brian-brazil
Copy link
Contributor

if we want to support windows and osx metrics.

Windows is so different that I expect it to be a completely separate WMI exporter, I expect little to no overlap with the node exporter.

What is your thought on this in general? I always wanted it to support metrics across platforms.

I was under the impression that we already had the basic OS X metrics.

I would try to separate the generic gopsutil collector which provides basic metrics across all platforms, from OS specific collectors.

I'm a bit wary of this. My investigation of some of what are considered basic metrics (particularly for memory) is that there's actually no clear definition. So we may end up with the gopsutil definition rather than something that's a real consistent metric across platforms.

@discordianfish
Copy link
Member Author

Okay, I'm going to close this. We'll use our fork and maintain it independently.

@fabxc fabxc deleted the fish/use-gopsutils branch April 30, 2016 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants