Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

journal: provide a way to distinguish between stdout and stderr streams #6599

Closed
wants to merge 3 commits into from

Conversation

fabiokung
Copy link
Contributor

@fabiokung fabiokung commented Aug 12, 2017

Fix #5019 by adding a new _SOURCE=stdout|stderr field to log entries from units using StandardOutput=journal and StandardError=journal.

This allows readers of the log to filter different streams from the same unit. A small difference from what's described on #5019 is that StandardError=inherit can force the stderr stream to have the same _SOURCE=stdout from the stdout stream. This makes more sense to me, since it means exactly what it says. In order to have different _SOURCE fields from the same unit, users can use:

StandardOutput=journal[+console]
StandardError=journal[+console]

Also, start versioning the protocol used for stdout streams, making it easier to add new fields in a backwards compatible way in the future. All the changes here should be backwards compatible (e.g.: an older systemd-cat binary sending logs to a newer journald daemon), but I couldn't find a good way to make the existing protocol (I called it v0) forwards compatible with the changes (e.g.: a newer systemd-cat sending messages to an older journald daemon).

How much do we care about forwards/backwards compatibility? Or do we expect daemons + all systemd binaries to be always upgraded together? I recently noticed the journal on-disk format not being compatible with older versions of journalctl, which gave me the impression that we don't need to care too much about compatibility.

Before going this direction, I also explored a few alternatives of implementing #5019:

  1. Use a different socket for stderr streams, but that seemed overkill since the existing socket supports multiple streams and can be configured by a simple protocol including a line-separated header.
  2. Hijack one of the existing fields, such as _TRANSPORT= or _SYSLOG_IDENTIFIER. I started with _TRANSPORT, but quickly realized it would completely change its existing semantics and potentially break people's current assumptions. _SYSLOG_IDENTIFIER seemed equally dangerous, I considered appending a stdout|stderr suffix to it, but it has been the process name for too long already. I'm sure people are already depending on this somewhere.

As part of adding a new field, I created a new v1 version of the stdout stream protocol, that ignores unknown fields up to a termination marker (magic string). I don't particularly love it, but it was the simplest and safest way I found to make it a bit more future proof.

Final note that I didn't change systemd-cat (and sd_journal_stream_fd(3)) to make use of the new field (and splitting stdout and stderr of the exec'd process), but that can be easily done in a future PR if anyone is interested.

/cc @boucman @poettering

@fabiokung fabiokung force-pushed the split-stderr branch 3 times, most recently from da54bb1 to 17f0ebf Compare August 14, 2017 20:16
@fabiokung
Copy link
Contributor Author

CI failures seem unrelated.

@evverx
Copy link
Member

evverx commented Aug 18, 2017

@fabiokung, thank you for working on this. The feature seems to be useful, but I am a bit concerned that it might encourage people to use StandardOutput=journal and StandardError=journal, because that will double the number of streams used by each daemon and will reduce the number of services which can be handled by journald.

@poettering
Copy link
Member

Hmm, this change does come at a price: stdout and stderr will most likely not be interleaved properly anymore, as both enter seperate socket buffers and the journald doesn't know which text was used when...

I am not sure if this is reason enough not to do this at all though, in particular if this is opt-in only.

Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, a few comments though

STDOUT_STREAM_RUNNING
STDOUT_STREAM_RUNNING,
STDOUT_STREAM_VERSION,
STDOUT_STREAM_DRAIN,
} StdoutStreamState;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i am always a bit careful with "version" tags in file formats and network protocols. I very much prefer schemes where we have feature bits of some kind, or tagged fields or so, so that each implementation can pick the features it wants, instead of having to opt-in into a completely new set of features with each version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that all said, i'd probably just have hacked this into some existing field, after all we just need one or two bits for this. Maybe OR 1024 and 2048 into the priority field? the valid range for that is 0..999, and hence we stay mostly compatibile. if 1024 is set we'd consider it he stdout stream, if 2048 is set we'd consider it the stderr stream, otherwie we don#t know

return 0;

case STDOUT_STREAM_VERSION:
r = safe_atoi(p, &version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepting a negative version is weird, no?

@@ -333,6 +333,16 @@
</variablelist>
</listitem>
</varlistentry>

<varlistentry>
<term><varname>_SOURCE=</varname></term>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "_SOURCE" is overly generic a term. Maybe call this _STDIO_STREAM= instead? In that case it's very clear that this has a relationship to stdio...

<para>Name of the stream that originated the message.
It is commonly used to differentiate messages coming
from different streams of the same process.
E.g.: stdout vs stderr.</para>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdout/stderr should be enclosed in <literal> I figure

@poettering
Copy link
Member

Also, needs a rebase on current git, too

@poettering poettering added journal pid1 reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Aug 31, 2017
@poettering
Copy link
Member

(and sorry for the late review, many of us have been on vacation/still are)

@portante
Copy link
Contributor

portante commented Sep 5, 2017

With this patch, stdout/stderr would still be considered at the same priority level, right? Would it be worth considering marking one as being more important than the other?

@poettering
Copy link
Member

With this patch, stdout/stderr would still be considered at the same priority level, right? Would it be worth considering marking one as being more important than the other?

Yeah, such a change would be very welcome

@fabiokung
Copy link
Contributor Author

Apologies for the radio silence here. I've been out myself for the past month.

I'll take a stab at making some of the changes suggested here, and will ping you all again.

@poettering
Copy link
Member

ping?

@fabiokung
Copy link
Contributor Author

fabiokung commented Nov 20, 2017 via email

@fabiokung
Copy link
Contributor Author

@poettering I finally had a chance to rebase and change some things based on the suggestions. PTAL

The existing protocol is considered as version=0. A version can be sent
as part of the header, as a numeric (integer) line right after the
stream identifier. If it isn't a number, we fall back to version=0 and
treat the second line as the unit id string, keeping backwards
compatibility.

For versions > 0, ignore all unknown fields at the bottom (end) of the
header, until a special STDOUT_STREAM_HEADER_MARKER line is seen, so in
the future new lines (fields) can be added to the protocol in a
backwards compatible way (allowing forwards compatibility from now on).
Bump the stdout header protocol to v1, so a new "name" field can be
(optionally) configured per stream. This is used to differentiate stdout
and stderr streams from units (StandardOutput vs StandardError in unit
config).

Log entries from units using StandardOutput=journal[+console] and
StandardError=journal[+console] will have a new _STDIO_STREAM=stdout|stderr
field on their log entries in the journal.
Allowing it to configure a different _STDIO_STREAM field (stdout or stderr)
in the future when so desired.
@yuwata yuwata removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Feb 5, 2018
@fabiokung
Copy link
Contributor Author

ping

@poettering
Copy link
Member

hmm did you see the comments about the protocol versioning?

@fabiokung
Copy link
Contributor Author

fabiokung commented Apr 27, 2018

@poettering yes, I thought about it. My personal preference would be to have the explicit field to make the intent clearer, rather than hijacking an existing field with some magic bits to save some bytes. But if you strongly prefer not doing the version field and hijacking an existing one, that would be fine by me as well.

One advantage of the current way with an extra field, is that it allows for streams with arbitrary names. stdout and stderr just happens to be what is currently being used. It is more flexible, but I will admit I'm not sure if arbitrary names is even a good thing to support.

@poettering
Copy link
Member

Is there still interest in this PR?

BTW, I came up with an idea so that we can do what you are asking for without losing the "global ordering" property: we can switch to using an AF_UNIX/SOCK_DGRAM on the journald side plus two such sockets connected to it on the client side. When we use recvfrom() on the journald side to read the data we can then distinguish from which of the two sockets it came, by looking at the "struct sockaddr" we get. And since there's only one socket queue used on the receiving side evertyhing will be perfectly ordered.

I implemented such an approach for a different usecase here: https://github.com/poettering/rederr/blob/master/rederr.c and it works quite well.

What's also nice is that this way we can also get the precise pid of the sender each time we get a message, by using the SCM_CREDENTIALS data.

@evverx
Copy link
Member

evverx commented Nov 14, 2018

@poettering that sounds interesting. I'm wondering if this scheme will somehow help to address the issue that was discussed in #10444 (comment).

@fabiokung
Copy link
Contributor Author

@poettering I don't have an immediate need for the feature anymore, and we are no more running my original patches anywhere.

Your idea/implementation looks clean and promising. The implications of having stdout/stderr connected to a socket are a bit worrisome, but since this will be true for processes connected to a journald (and writing to the journal), it will be probably fine.

IIUC your proposal correctly, it would require a new socket on the server side per client (e.g.: each unit connected to the journal). Any concerns with the max number of FDs and sockets for the journald process?

@fabiokung
Copy link
Contributor Author

or are you planning on keeping a single socket server side, and routing messages using the PID from SCM_CREDENTIALS?

@poettering
Copy link
Member

IIUC your proposal correctly, it would require a new socket on the server side per client (e.g.: each unit connected to the journal). Any concerns with the max number of FDs and sockets for the journald process?

No, on the journald side there would still be a single socket, but from the sockaddr_un data we get for each incoming datagram packet we can distuingish stdout from stderr...

@poettering
Copy link
Member

or are you planning on keeping a single socket server side, and routing messages using the PID from SCM_CREDENTIALS?

the SCM_CREDENTIALS stuff is useful to distinguish clients from each other, but the sender sockaddr_un passed in for incoming datagrams is then useful for distinguishing individual streams of a specific client. i.e. recvfrom() after all has the "struct sockaddr" argument after all, and it's just a matter of making use of that...

@poettering
Copy link
Member

Anyway, since the immediate need for this is gone, let's close this for now. If this comes up again I am sure I'll remember and point people again to this idea, so that the concept can be resurrected.

@lpar
Copy link

lpar commented Jun 4, 2019

I was just trying to replace some cron tasks with systemd units, and wanted this feature.

The tasks are written in the usual Unix way, to produce output on stdout and errors on stderr. I'd like to extract just the errors via journalctl and report them, but still have everything available for examination via journalctl if needed.

@spmfox
Copy link

spmfox commented May 19, 2021

Anyway, since the immediate need for this is gone, let's close this for now. If this comes up again I am sure I'll remember and point people again to this idea, so that the concept can be resurrected.

@poettering Sorry I cant provide any code towards this, however I find myself in the same situation as @lpar on this. As I've been learning to leverage systemd for all the things, a use case would be to use the journalctl -p command to filter stderr from the script run from a unit. I couldn't figure out why my output wasn't showing up as an 'error' in the journal until I found this.

In the container age, it seems all the rage is to just output everything to stdout and stderr and let the orchestration system handle the logging, be that podman - docker - or systemd/journald.

@portante
Copy link
Contributor

If kubernetes is going to support journald for proper logging, by having conmon write to journald instead of to disk (today, having conmon write directly to files destabilizes a system), then we need to be sure we can label stderr and stdout error streams separately.

@mattpr
Copy link

mattpr commented Apr 5, 2022

A lot of apps are just writing to stdout/err for info/error log levels and then letting the journal or "something else" handle the log management. Especially true when you have one app instance per core running in parallel behind some load balancing.

Most of these apps are not writing syslog format logs and I'm not sure it makes sense for them to make that adaptation on the app message outputting end (app doesn't know who is consuming logs or how and shouldn't need to in many cases I think). So setting a priority on the app logger side isn't a great option.

If there were a way to specify default priority within the unit file, that might be good enough...

StandardOutput=journal@6
StandardError=journal@3

But it would be really great to be able to distinguish between messages written to stdout and stderr. The specific use-case is that as long as apps have good hygiene about not writing to stderr unless it is important, we can easily write filters to surface and alert on errors. A troubleshooter could then go find those errors in the journal in context along with info-level logs.

@dmotte
Copy link

dmotte commented Dec 20, 2022

Please why did you close this? It would be extremely useful to distinguish between stdout and stderr in journal

@kooltux
Copy link

kooltux commented Dec 27, 2022

Agreed with dmotte. Please explain the reason to close this issue. So many legacy software and scripts just use stdout/stderr to log things correctly with 2 levels of information.
Having a way to map these to syslog levels in a 'per-service' way would be really helpful for many sysadmins :)

@audaki
Copy link

audaki commented Feb 13, 2023

@poettering

Please consider re-opening this.

The need is still here.

Currently the info (stdout vs stderr) is completely lost and we are debugging a system and can't filter stderr because the app didn't mark the std::cerr messages in any other way. I'm currently looking through the source of the app to find all the messages that go into std::cerr so we have at least some info.

@mattpr
Copy link

mattpr commented Feb 23, 2023

I think if this is going to happen, someone who wants this will need top implement a reasonable solution and open a new PR (this particular PR implementation was rejected).

There was already a suggestion for how to implement this in a clean way also preserving ordering.

I haven't done any C in a long time, and have too much on my plate to get back into it at a level where I would feel comfortable opening a PR on this anytime soon. So it won't be me. But I do keep hoping a C-dev who wants this and has more time can take Poettering's implementation suggestion and open another PR.

As a hack you can wrap your app exec in a script and run stdout/err through sed subprocesses to prepend syslog priority tags to each log line (e.g. <3>I'm an stderr log line) which supposedly the journal will pick up and respect. The journal would consume the stdout/err from the wrapper script with the priority prefixes attached. I haven't tested yet, but it is one possibility we are considering.

Or we may just stop using the journal for any info-level (stdout) logging. That way we know everything in the journal from the apps we run is an error. Going back to log files and logrotate for non-error logs.

@poettering
Copy link
Member

@poettering

Please consider re-opening this.

The need is still here.

This is a PR and PRs can only be merged if they are brought into shape by someone doing the work. This didn't happen here, so it was eventually closed.

If you are interested in seeing this implemented. Step up, do the work, and we can get this done.

I think everyone fully agrees that the goal here is good. But if noone wants to do the work, then it's never going to happen. And keeping this PR open even though noone cares for it anymore is not the same as "doing the work".

Anyway: please submit a PR and we'll happily review it.

@jkrauss82
Copy link

I am baffled that distinguishing between stdout and stderr is currently not implemented in systemd in 2023. This seems like a very basic feature to have for a system-wide logging facility, I wonder why it was decided to not have this built in from the start.

Just posting here because:

The need is still here.

For us currently this is a blocker to adopt journald for logging. We will have to stick to syslog / file logging for now, which is a shame given the advantages systemd and journald bring to the table.

@mattpr
Copy link

mattpr commented Jul 31, 2023

We will have to stick to syslog

@jkrauss82 if your apps are using syslog APIs anyway, you can use sd-daemon logging priority prefixes to get different log levels in the journal. See this answer for an example.

However for apps that just naively log to stdout/err the only option you have is to build a wrapper app that handles launching your app and dealing with translating stdout/err so they are differentiated in syslog/journal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

journald should distinguish service log msgs written to stdout and stderr