Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make unit failure verbose #4367

Closed
wants to merge 1 commit into from

Conversation

utezduyar
Copy link
Contributor

Something entering failed state should be more visible
in the system logs. Therefore, increase the log level
to at least warning.

Something entering failed state should be more visible
in the system logs. Therefore, increase the log level
to at least warning.
@keszybz
Copy link
Member

keszybz commented Oct 13, 2016

I don't think this is warranted. By default logs have level info, so this already is in the logs. Most things log their own failures are error level, and adding an additional warning from systemd would just increase the amount of high-priority messages.

Do you have a specific case where this makes a positive difference?

@utezduyar
Copy link
Contributor Author

Not if the failure is due to a signal. Example following:

2016-09-09T19:22:53.681+10:00 [ NOTICE ] systemd[1]: duolith.service: Main process exited, code=killed, status=10/BUS
2016-09-09T19:22:54.284+10:00 [ NOTICE ] systemd[1]: duolith.service: Unit entered failed state.
2016-09-09T19:22:54.492+10:00 [ INFO ] systemd[1]: duolith.service: Service hold-off time over, scheduling restart.

Here we couldn't catch that something was wrong as long as we filter things with ERROR level.

@keszybz
Copy link
Member

keszybz commented Oct 13, 2016

So yeah, if the process dies because of a signal, and the signal is of the wrong kind (Lennart recently added a function to classify signals, that takes into account whether the processes is oneshot or not), we should bump the log level here to ERROR. It is expected that in those cases the processes did not log anything. But for the other cases, where the processes died "normally", I think the current level of notice is OK.

When this is done, as a second step, it'd be nice to decrease the log level from systemd-coredump. It currently logs the backtrace at ERROR level, but it really doesn't have to, especially considering that the bt can be rather long.

@utezduyar
Copy link
Contributor Author

Don't you think it will create confusion that sometimes we see messages in the higher severity, sometimes lower. This would be hard for someone filtering the logs I believe especially if you don't even collect the logs that are less important than warning due to flash wear out.

I also believe that as a system manager, we shouldn't rely on application informing us about their abnormal exit. I understand your argument though that higher priority messages will increase in this patch's case which is warning level.

@keszybz
Copy link
Member

keszybz commented Oct 17, 2016

Don't you think it will create confusion that sometimes we see messages in the higher severity, sometimes lower.

We do that quite often... most log_fulls in the source tree are for that.

if you don't even collect the logs that are less important than warning due to flash wear out.

True. But OTOH, you could say that if we keep the number of messages at high levels down, the most important ones will stand out. In my mind, the primary use case for systemd is linux distros, where you expect failing applications to log their own errors. And if they don't, fix them. I agree that it's better to have an error from systemd than nothing, but systemd has limited knowledge, and the application can usually provide a much more useful message, so this should be preferred.

@poettering
Copy link
Member

I have implemented @keszybz's suggestion to upgrade the log level when a process dies due to signal now. Let's continue discussion on the new PR I filed about this: #4415. Closing this one in favour of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

3 participants