Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] core: add Unit.SilentOnSuccess config item #14279

Closed
wants to merge 1 commit into from
Closed

[WIP] core: add Unit.SilentOnSuccess config item #14279

wants to merge 1 commit into from

Conversation

rhendric
Copy link
Contributor

@rhendric rhendric commented Dec 7, 2019

The target use case for this feature is a service that runs on a timer
many times an hour, where the system administrator decides that writing
a generic success message to the journal every few minutes or seconds
adds no diagnostic value and isn't worth the clutter or disk I/O.


This is an attempt to close #9328. There are some design decisions here that could easily go a different way:

  • Should SilentOnSuccess change the generic success messages' level from info to debug instead of eliding them altogether?
  • Should there be multiple settings here? (@eMPee584 proposed two items, SilentOnSuccess and SilentStartup, which got a few 👍s from supporters; I didn't see a use for two separate items but if presented with one I could see myself being swayed.)
  • I'm far from an expert on systemd, and in particular I don't understand the principles behind what config items go in which sections—I made it a [Unit] config item because that seemed most straightforward, but is that correct?

This is marked work-in-progress because I'm really submitting it in the hope that a concrete implementation might move the conversation in #9328 forward some, which seems likely to result in some changes to requirements. Very open to changing any/all of this design as long as we end up with some way to configure a timer'd service to refrain from spamming the journal with generic success messages.

The target use case for this feature is a service that runs on a timer
many times an hour, where the system administrator decides that writing
a generic success message to the journal every few minutes or seconds
adds no diagnostic value and isn't worth the clutter or disk I/O.
@Jajcus
Copy link

Jajcus commented Dec 9, 2019

Should SilentOnSuccess change the generic success messages' level from info to debug instead of eliding them altogether?

That would do only if we were able to make journald stop storing those.

Should there be multiple settings here? (@eMPee584 proposed two items, SilentOnSuccess and SilentStartup, which got a few +1s from supporters; I didn't see a use for two separate items but if presented with one I could see myself being swayed.)

One setting seems ok if it suppresses both messages ('starting' and 'started'), though the 'SilentOnSuccess' name may suggest it only affects the latter (which is not a big problem for me).

I'm far from an expert on systemd, and in particular I don't understand the principles behind what config items go in which sections—I made it a [Unit] config item because that seemed most straightforward, but is that correct?

Feels more like a [Service] item to me… though, I am not really sure. [Service] won't be right if that change also affects units other than '.service'.

@boucman
Copy link
Contributor

boucman commented Dec 9, 2019

remember that we also have journal-namespace in the pipe...

With journal namespaces, you can have your unit have its own journald and thus store its message separately, and with a different filetring level (since that's configurable in journald)

@rhendric
Copy link
Contributor Author

One setting seems ok if it suppresses both messages ('starting' and 'started'), though the 'SilentOnSuccess' name may suggest it only affects the latter (which is not a big problem for me).

Ah, I see... the services with which I'd be interested in using this feature already don't emit ‘starting’ messages, but of course different service types would. Thanks, now the desire for a second setting makes sense. The current patch doesn't do anything about ‘starting’, only ‘started’.

Feels more like a [Service] item to me… though, I am not really sure. [Service] won't be right if that change also affects units other than '.service'.

This patch does apply to other unit types—if present, SilentOnSuccess suppresses the "Mounted %s.", "Activated swap %s.", "Found device %s.", "Set up automount %s.", "Created slice %s.", "Listening on %s.", "Reached target %s.", and "Started %s." messages associated with successful starts of mount, swap, device, automount, slice, socket, target, and service units respectively.

I can't think of any reason you'd want that for any of those types other than service—who has a setup where the log noise coming from mounts is a concern?—but this was the simplest implementation, as the logging code that needed to be skipped was deep in the generic unit handling functions. And perhaps I'm missing a valid use case for other unit types. But I do kind of agree that it makes more sense as a service-only option. (Maintainers: if this is desirable, I see two paths to achieving that from the current patch: one is to thread an extra parameter into the call graph between unit_notify and job_finish_and_invalidate; the other is to ‘cheat’ in load-fragment-gperf.gperf.m4 and just move the option to the Service section without moving the field from the Unit struct, as with the legacy Service.StartLimitInterval and company. Which is preferable?)

@poettering
Copy link
Member

I am not convinced this is really desired. Logging is a good thing, including for repetitive stuff... Filter the output on display, not when collecting the data. There's a good chance you want to know if the timer unit ran whenver you configured it to run, and the way to figure out is … well … the log data.

@Jajcus
Copy link

Jajcus commented Dec 10, 2019

This make a big difference if I have a month of useful logs on a given system or just a few days, mostly of useless 'timed service started' messages. Different systems have different constraints. Sometimes disk space is a problem, sometimes amount of disk I/O. Systemd journal is a really convenient logging system and it is great to use it for logging of the actual work done by applications.

Sending everything to journald would be ok for me if I could filter it there before storing it. Currently filtering is only available on display (journalctl) and that is to late (resources already wasted). The journald namespace seems promising, but it is not exactly that.

Now I sometimes need to run a separate service doing scheduled tasks, even though I would prefer to use the convenience of systemd timer units, just to prevent excessive logging. Even crond was less chatty.

@rhendric
Copy link
Contributor Author

Filter the output on display, not when collecting the data.

I agree that it's good to err in that direction. If this were an absolute principle, though, there'd be no LogLevelMax. I see this setting as a refinement of that one—in both cases, the default behavior remains to log everything and let the consumer sort it out, but for sysadmins who know their systems and their requirements, less important messages can be prevented from consuming any resources at all on a unit-by-unit basis. The participants in #9328 and here are just voicing that not all ‘started successfully’ messages are of equal importance.

Do you have a principle behind why LogLevelMax is desired but SilentOnSuccess may not be?

@lilalkor
Copy link

lilalkor commented Mar 8, 2020

It is definitely needed one. Speaking about debugging your timer - if you specifically told it to be silent, and want to debug, you can enable the output againg. And anyway, if your script or something you are starting with it provides no output or correct exit code, you won't get any use of it now. And if you know what you are doing, you should be able to control the logging.

Right now it forces people with limited disk space or IO to ignore systemd timers, install some cron daemon and use it, because with cron you can control, what to log and what to ignore.

@r7l
Copy link

r7l commented Mar 12, 2020

Given that systemd claims to be the successor to allot of things in LInux, it's hard to believe that this vital feature is still not existing. When using any random Cron application i've used over the years, it was and still is possible to mute logging:

Just add: > /dev/null 2>&1

In my case right now, i have a script that i want to call every 10 seconds. Most of the time this means, the systems journal will have dozens of lines about this single timer before i see anything else. So i am forced to install a 3rd party Cron to keep my journal from being clogged up with spam.

@medhefgo
Copy link
Contributor

Imho, the correct approach here would be to (finally) add filter+redirection support to journald instead of this very narrow solution. I think this is very much needed in a world where everything is supposed to just log and forget. Some applications/workflows can be very chatty but with little use to you.
And one reason I find this problematic is that any verbose logging from one app/service can trigger log rotation right now which could evict important messages that you actually do care about.

Also, considering that the journal gets structured input, filtering and redirection output should be easy to add with a very simple rule-set. Think of something that would let you filter on any journal field name as well as their contents. Both with the usual comparison operators as well as regex matching.
Add to that the ability to define different journaling targets so you could send every message to runtime journal and a short-lived (low storage quota) persistent journal. As well as have only important messages (however the rule-set defines those) into a important journal that gets more disk space allocated.

@rhendric
Copy link
Contributor Author

Unless his thinking has evolved, @poettering seems to be of the mind that sources should filter their log output, not journald.

#2447 (comment):

Why would you even send data to journald if you don't want it to stay in memory or on disk? If you want to suppress log data, why not attack it at the source of the problem, instead of turning the destination in to selective null sink?

Ergo, if we don't want certain messages from the systemd process to fill up our disks, we should probably convince him that systemd should be configurable not to log those messages, even if a broader solution in journald would cover more use cases.

@aldem
Copy link

aldem commented May 20, 2020

There's a good chance you want to know if the timer unit ran whenver you configured it to run

There is a good chance that we want to know that it didn't run when it should, but it would be really, really difficult to find that it didn't run once when it is normally running every 5 seconds and we have to sift through all this noise to pinpoint exact place where couple lines (looking exactly like the rest) are absent for more than 5 seconds.

Logs with tons of starting/started/stopped every second are taking space, quickly wears flash and consume I/O, while in most (regular usage) cases are useless. If there is a problem somewhere, usually we do have monitoring anyway, and only when really needed it makes sense to increase verbosity and to start deep inspection.

Base automatically changed from master to main January 21, 2021 11:54
@krutztq
Copy link

krutztq commented Jan 29, 2021

+1 here, giving my vote for ability to suppress such messages.

I wanted to use a small service with a timer on an embedded device. It has a small small flash size and I have to keep an eye on wear leveling. I just want to suppress the start/stop messages of this special service to log only error messages of it.

For now I have to work around.

@poettering
Copy link
Member

If there's still interest in this. I'd be fine if the existing LogLevelMax= is updated to also apply to messages logged by PID 1 about the unit, instead of only the stuff logged from the unit's own processes. Then you could do LogLevelMax=notice in your unit, and informational messages would disappear from the logs, if that's what you want, only leaving notice/warning/error message around.

@krutztq
Copy link

krutztq commented Mar 23, 2021

If there's still interest in this. I'd be fine if the existing LogLevelMax= is updated to also apply to messages logged by PID 1 about the unit, instead of only the stuff logged from the unit's own processes. Then you could do LogLevelMax=notice in your unit, and informational messages would disappear from the logs, if that's what you want, only leaving notice/warning/error message around.

This would only work for devices with a corresponding log level (e.g. fully developed devices). This does not work in a beta test, which only needs certain services with debug level.

@poettering
Copy link
Member

This would only work for devices with a corresponding log level (e.g. fully developed devices). This does not work in a beta test, which only needs certain services with debug level.

I can't parse this.

Anyway, closing this, let's continue in #19050

@poettering poettering closed this Apr 28, 2021
@socketpair
Copy link

Finally:

[Service]
LogLevelMax=....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Impossible to suppress start/stop log output for timers?
10 participants