Rate limit duplicate log lines #2326
Comments
One of the basic problems with solving this, and other things (such as tracking error count per-plugin, or starting telegraf in a new namespace via exec (#2087)), is that some of the plugins write errors directly to STDOUT/STDERR. We could do something like opening a pipe and redirecting STDOUT/STDERR into the pipe, but then we don't know which plugin generated the error, which prevents us from properly tracking errors per plugin (#1348). We could also force plugins to use […] The only other option I can think of is to provide a per-plugin logger for plugins to use. We would again redirect STDOUT/STDERR to […] I think these latter 2 options are the better solutions: aside from addressing #1348 & #2087, they also make it much easier for admins to ensure telegraf logs go to syslog. Personally this isn't a problem for me, as my systems use systemd, so STDOUT/STDERR is automatically collected into the journal. But for non-systemd users, getting the logs into syslog is a lot harder/messier. @sparrc thoughts?
I think I'd prefer not to have a per-plugin logger, but could be convinced otherwise if it's really needed. Whatever we do, it needs to reflect influxdata/influxdb#7671
So what does closing this mean? The issues mentioned will not be addressed?
oops, didn't mean to close it
I believe this is handled by […]
No, systemd does not handle this. Systemd rate limits all log lines, not duplicate log lines specifically. In fact, this functionality is needed to prevent the systemd rate limit from suppressing important information. But the primary goal is to prevent spammy plugins from drowning out legitimate messages.
Feature Request
Opening a feature request kicks off a discussion.
Proposal:
Telegraf should rate limit duplicate log entries to prevent the same line from spamming the logs.
Current behavior:
There is no rate limiting on duplicate lines, so telegraf can repeat the same error hundreds of times per second.
Desired behavior:
Telegraf should keep track of what it has logged and suppress repeats of the same log message within an X-second period, where X is configurable.
Note that duplicate lines should not need to be consecutive to be suppressed: a repeated message should still be suppressed when other messages are logged between its occurrences.
Use case:
When errors occur in some of the plugins, their messages can be extremely repetitive, sometimes generating hundreds of identical messages per second. This can drown out other, quieter errors. It can also saturate or fill up log destinations (e.g. bandwidth or disk space).