Track system-wide file descriptor usage #2597

Closed
phemmer opened this issue Mar 29, 2017 · 6 comments
Labels
help wanted Request for community participation, code, contribution

Comments

@phemmer
Contributor

phemmer commented Mar 29, 2017

Feature Request

Proposal:

Telegraf should report on system-wide (not process-level) file descriptor usage & limit.

Current behavior:

Telegraf does not currently report these values.

Desired behavior:

Telegraf reports the system-wide file descriptor usage and limit.

Use case:

If the number of file descriptors in use reaches the system-wide maximum, any attempt by an application to open a new one will fail. It is therefore an important statistic to monitor.

What I'm unsure about is whether we should monitor additional metrics as well, and which measurement they should go in.
There are a bunch of metrics that are reported alongside the file descriptor usage metric: https://www.kernel.org/doc/Documentation/sysctl/fs.txt (everything with -max, -nr, & -state suffixes). Given the number of potential metrics, it might make sense to put these in a new measurement, such as linux_sysctl_fs.
The main argument against this is that other operating systems, such as FreeBSD, also have a system-wide max file descriptor count, but not some of the other metrics. In that case, for consistency across platforms, it makes sense to use the system measurement.
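
For illustration, here is a minimal Go sketch of reading the system-wide counters on Linux. It assumes the three-field layout of /proc/sys/fs/file-nr described in the kernel document linked above (allocated handles, allocated-but-unused handles, maximum). The names FileNr, parseFileNr, and UsedPercent are hypothetical and are not taken from Telegraf.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// FileNr holds the three fields of /proc/sys/fs/file-nr.
// The type and its field names are illustrative, not Telegraf's.
type FileNr struct {
	Allocated uint64 // allocated file handles
	Unused    uint64 // allocated but unused file handles
	Max       uint64 // system-wide maximum (same value as fs.file-max)
}

// parseFileNr parses the whitespace-separated contents of file-nr.
func parseFileNr(s string) (FileNr, error) {
	fields := strings.Fields(s)
	if len(fields) != 3 {
		return FileNr{}, fmt.Errorf("expected 3 fields, got %d", len(fields))
	}
	var vals [3]uint64
	for i, f := range fields {
		v, err := strconv.ParseUint(f, 10, 64)
		if err != nil {
			return FileNr{}, fmt.Errorf("field %d: %w", i, err)
		}
		vals[i] = v
	}
	return FileNr{Allocated: vals[0], Unused: vals[1], Max: vals[2]}, nil
}

// UsedPercent returns handles in use as a percentage of the maximum.
// It returns 0 when the maximum is 0 or the counters are inconsistent.
func (n FileNr) UsedPercent() float64 {
	if n.Max == 0 || n.Unused > n.Allocated {
		return 0
	}
	return float64(n.Allocated-n.Unused) / float64(n.Max) * 100
}

func main() {
	data, err := os.ReadFile("/proc/sys/fs/file-nr")
	if err != nil {
		// Expected on non-Linux systems, where this file does not exist.
		fmt.Fprintln(os.Stderr, "read file-nr:", err)
		return
	}
	nr, err := parseFileNr(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse file-nr:", err)
		return
	}
	fmt.Printf("allocated=%d unused=%d max=%d used=%.2f%%\n",
		nr.Allocated, nr.Unused, nr.Max, nr.UsedPercent())
}
```

Keeping the parser separate from the file read makes it testable with a fixed string, and the same approach would extend to the other -nr and -state files if they were added to the measurement.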

@danielnelson
Contributor

Seems useful. I have been more interested in my per user limits, because I always seem to hit them first, but these are important too.

I think these are more advanced metrics, so I'm inclined to put them in a new measurement. That way they are not turned on as part of the system measurement and there is no confusion about how to turn them on. It might still be possible for this measurement to have available metrics for more than one operating system though, so long as they don't conflict.

@phemmer
Contributor Author

phemmer commented Mar 30, 2017

Doesn't the procstat plugin track limits? Haven't really used it for much, so not sure. If not then I would think it should.

In any case, any preferences/ideas on the input & measurement names?

@danielnelson
Contributor

Yeah, it reports the number of file descriptors. I don't know of a better way to get the fd count per user other than summing it up by process, so I guess running procstat with user is good.

I'm thinking sysctl_fs as the input/measurement. I'm assuming that this name implies Linux only, and that most of the stats are too different across platforms for us to generalize.

nhaugo added the help wanted label Mar 30, 2017
nhaugo added this to the Future Milestone milestone Mar 30, 2017
@phemmer
Contributor Author

phemmer commented Mar 31, 2017

I'm not sure what you mean by per-user. There is no per-user file descriptor limit, only per-process and system-wide limits.

As far as the name goes, I'm fine with that sort of scheme. We've recently decommissioned all our FreeBSD hosts; otherwise I would have objected much more vocally, as having to select this field from 2 different measurements would be a pain.
But I would vote to name it linux_sysctl_fs. Other OSs have sysctl, so the linux_ prefix makes it unambiguous.

Will start on this. Should be an easy implementation.

@danielnelson
Contributor

Oh, I always thought ulimit -Hn and ulimit -Sn showed a user limit. The rest sounds good.
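
For reference, the values shown by ulimit -Sn and ulimit -Hn are the soft and hard RLIMIT_NOFILE resource limits of the calling process, which child processes inherit. A minimal Go sketch for reading them through the standard syscall package on a Unix-like system follows. The helper name nofileLimits is hypothetical. Recent Go runtimes may raise the process's own soft limit at startup, so the soft value printed here may differ from what ulimit -Sn shows in the parent shell.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// nofileLimits returns the soft and hard RLIMIT_NOFILE values for the
// current process. These are per-process limits, not per-user ones.
func nofileLimits() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	// The conversion keeps the code portable to platforms where the
	// Rlimit fields are signed.
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := nofileLimits()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit:", err)
		return
	}
	fmt.Printf("soft=%d hard=%d\n", soft, hard)
}
```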

@danielnelson
Contributor

Implemented in #2609
