journal: provide a way to distinguish between stdout and stderr streams #6599

fabiokung · 2017-08-12T00:54:09Z

Fix #5019 by adding a new _SOURCE=stdout|stderr field to log entries from units using StandardOutput=journal and StandardError=journal.

This allows readers of the log to filter different streams from the same unit. A small difference from what's described on #5019 is that StandardError=inherit can force the stderr stream to have the same _SOURCE=stdout from the stdout stream. This makes more sense to me, since it means exactly what it says. In order to have different _SOURCE fields from the same unit, users can use:

StandardOutput=journal[+console]
StandardError=journal[+console]

Also, start versioning the protocol used for stdout streams, making it easier to add new fields in a backwards compatible way in the future. All the changes here should be backwards compatible (e.g.: an older systemd-cat binary sending logs to a newer journald daemon), but I couldn't find a good way to make the existing protocol (I called it v0) forwards compatible with the changes (e.g.: a newer systemd-cat sending messages to an older journald daemon).

How much do we care about forwards/backwards compatibility? Or do we expect daemons + all systemd binaries to be always upgraded together? I recently noticed the journal on-disk format not being compatible with older versions of journalctl, which gave me the impression that we don't need to care too much about compatibility.

Before going this direction, I also explored a few alternatives of implementing #5019:

Use a different socket for stderr streams, but that seemed overkill since the existing socket supports multiple streams and can be configured by a simple protocol including a line-separated header.
Hijack one of the existing fields, such as _TRANSPORT= or _SYSLOG_IDENTIFIER. I started with _TRANSPORT, but quickly realized it would completely change its existing semantics and potentially break people's current assumptions. _SYSLOG_IDENTIFIER seemed equally dangerous, I considered appending a stdout|stderr suffix to it, but it has been the process name for too long already. I'm sure people are already depending on this somewhere.

As part of adding a new field, I created a new v1 version of the stdout stream protocol, that ignores unknown fields up to a termination marker (magic string). I don't particularly love it, but it was the simplest and safest way I found to make it a bit more future proof.

Final note that I didn't change systemd-cat (and sd_journal_stream_fd(3)) to make use of the new field (and splitting stdout and stderr of the exec'd process), but that can be easily done in a future PR if anyone is interested.

/cc @boucman @poettering

fabiokung · 2017-08-14T20:23:00Z

CI failures seem unrelated.

evverx · 2017-08-18T02:07:02Z

@fabiokung, thank you for working on this. The feature seems to be useful, but I am a bit concerned that it might encourage people to use StandardOutput=journal and StandardError=journal, because that will double the number of streams used by each daemon and will reduce the number of services which can be handled by journald.

poettering · 2017-08-30T16:30:16Z

Hmm, this change does come at a price: stdout and stderr will most likely not be interleaved properly anymore, as both enter seperate socket buffers and the journald doesn't know which text was used when...

I am not sure if this is reason enough not to do this at all though, in particular if this is opt-in only.

poettering

Looks pretty good, a few comments though

poettering · 2017-08-31T10:57:52Z

src/journal/journald-stream.c

-        STDOUT_STREAM_RUNNING
+        STDOUT_STREAM_RUNNING,
+        STDOUT_STREAM_VERSION,
+        STDOUT_STREAM_DRAIN,
 } StdoutStreamState;


hmm, i am always a bit careful with "version" tags in file formats and network protocols. I very much prefer schemes where we have feature bits of some kind, or tagged fields or so, so that each implementation can pick the features it wants, instead of having to opt-in into a completely new set of features with each version.

that all said, i'd probably just have hacked this into some existing field, after all we just need one or two bits for this. Maybe OR 1024 and 2048 into the priority field? the valid range for that is 0..999, and hence we stay mostly compatibile. if 1024 is set we'd consider it he stdout stream, if 2048 is set we'd consider it the stderr stream, otherwie we don#t know

poettering · 2017-08-31T11:24:22Z

src/journal/journald-stream.c

                return 0;

+        case STDOUT_STREAM_VERSION:
+                r = safe_atoi(p, &version);


accepting a negative version is weird, no?

poettering · 2017-08-31T11:25:30Z

man/systemd.journal-fields.xml

@@ -333,6 +333,16 @@
          </variablelist>
        </listitem>
      </varlistentry>
+
+      <varlistentry>
+        <term><varname>_SOURCE=</varname></term>


I think "_SOURCE" is overly generic a term. Maybe call this _STDIO_STREAM= instead? In that case it's very clear that this has a relationship to stdio...

poettering · 2017-08-31T11:25:45Z

man/systemd.journal-fields.xml

+          <para>Name of the stream that originated the message.
+          It is commonly used to differentiate messages coming
+          from different streams of the same process.
+          E.g.: stdout vs stderr.</para>


stdout/stderr should be enclosed in <literal> I figure

poettering · 2017-08-31T11:28:04Z

Also, needs a rebase on current git, too

poettering · 2017-08-31T11:28:59Z

(and sorry for the late review, many of us have been on vacation/still are)

portante · 2017-09-05T17:52:32Z

With this patch, stdout/stderr would still be considered at the same priority level, right? Would it be worth considering marking one as being more important than the other?

poettering · 2017-09-06T15:43:28Z

With this patch, stdout/stderr would still be considered at the same priority level, right? Would it be worth considering marking one as being more important than the other?

Yeah, such a change would be very welcome

fabiokung · 2017-09-29T15:39:07Z

Apologies for the radio silence here. I've been out myself for the past month.

I'll take a stab at making some of the changes suggested here, and will ping you all again.

poettering · 2017-11-20T18:01:00Z

ping?

fabiokung · 2017-11-20T22:24:54Z

Sorry, I have been busy and distracted with other things. I will try and pick this up during the upcoming quiet time (the holidays), if no one else beats me to it.

fabiokung · 2018-02-04T08:19:24Z

@poettering I finally had a chance to rebase and change some things based on the suggestions. PTAL

The existing protocol is considered as version=0. A version can be sent as part of the header, as a numeric (integer) line right after the stream identifier. If it isn't a number, we fall back to version=0 and treat the second line as the unit id string, keeping backwards compatibility. For versions > 0, ignore all unknown fields at the bottom (end) of the header, until a special STDOUT_STREAM_HEADER_MARKER line is seen, so in the future new lines (fields) can be added to the protocol in a backwards compatible way (allowing forwards compatibility from now on).

Bump the stdout header protocol to v1, so a new "name" field can be (optionally) configured per stream. This is used to differentiate stdout and stderr streams from units (StandardOutput vs StandardError in unit config). Log entries from units using StandardOutput=journal[+console] and StandardError=journal[+console] will have a new _STDIO_STREAM=stdout|stderr field on their log entries in the journal.

Allowing it to configure a different _STDIO_STREAM field (stdout or stderr) in the future when so desired.

fabiokung · 2018-04-25T22:52:21Z

ping

poettering · 2018-04-27T13:06:50Z

hmm did you see the comments about the protocol versioning?

fabiokung · 2018-04-27T16:39:51Z

@poettering yes, I thought about it. My personal preference would be to have the explicit field to make the intent clearer, rather than hijacking an existing field with some magic bits to save some bytes. But if you strongly prefer not doing the version field and hijacking an existing one, that would be fine by me as well.

One advantage of the current way with an extra field, is that it allows for streams with arbitrary names. stdout and stderr just happens to be what is currently being used. It is more flexible, but I will admit I'm not sure if arbitrary names is even a good thing to support.

poettering · 2018-11-13T18:06:01Z

Is there still interest in this PR?

BTW, I came up with an idea so that we can do what you are asking for without losing the "global ordering" property: we can switch to using an AF_UNIX/SOCK_DGRAM on the journald side plus two such sockets connected to it on the client side. When we use recvfrom() on the journald side to read the data we can then distinguish from which of the two sockets it came, by looking at the "struct sockaddr" we get. And since there's only one socket queue used on the receiving side evertyhing will be perfectly ordered.

I implemented such an approach for a different usecase here: https://github.com/poettering/rederr/blob/master/rederr.c and it works quite well.

What's also nice is that this way we can also get the precise pid of the sender each time we get a message, by using the SCM_CREDENTIALS data.

evverx · 2018-11-14T13:03:10Z

@poettering that sounds interesting. I'm wondering if this scheme will somehow help to address the issue that was discussed in #10444 (comment).

fabiokung · 2018-11-18T03:07:30Z

@poettering I don't have an immediate need for the feature anymore, and we are no more running my original patches anywhere.

Your idea/implementation looks clean and promising. The implications of having stdout/stderr connected to a socket are a bit worrisome, but since this will be true for processes connected to a journald (and writing to the journal), it will be probably fine.

IIUC your proposal correctly, it would require a new socket on the server side per client (e.g.: each unit connected to the journal). Any concerns with the max number of FDs and sockets for the journald process?

fabiokung · 2018-11-18T03:10:10Z

or are you planning on keeping a single socket server side, and routing messages using the PID from SCM_CREDENTIALS?

poettering · 2018-12-27T11:31:16Z

IIUC your proposal correctly, it would require a new socket on the server side per client (e.g.: each unit connected to the journal). Any concerns with the max number of FDs and sockets for the journald process?

No, on the journald side there would still be a single socket, but from the sockaddr_un data we get for each incoming datagram packet we can distuingish stdout from stderr...

poettering · 2018-12-27T11:33:49Z

or are you planning on keeping a single socket server side, and routing messages using the PID from SCM_CREDENTIALS?

the SCM_CREDENTIALS stuff is useful to distinguish clients from each other, but the sender sockaddr_un passed in for incoming datagrams is then useful for distinguishing individual streams of a specific client. i.e. recvfrom() after all has the "struct sockaddr" argument after all, and it's just a matter of making use of that...

poettering · 2018-12-27T11:34:24Z

Anyway, since the immediate need for this is gone, let's close this for now. If this comes up again I am sure I'll remember and point people again to this idea, so that the concept can be resurrected.

lpar · 2019-06-04T15:52:23Z

I was just trying to replace some cron tasks with systemd units, and wanted this feature.

The tasks are written in the usual Unix way, to produce output on stdout and errors on stderr. I'd like to extract just the errors via journalctl and report them, but still have everything available for examination via journalctl if needed.

spmfox · 2021-05-19T19:57:29Z

Anyway, since the immediate need for this is gone, let's close this for now. If this comes up again I am sure I'll remember and point people again to this idea, so that the concept can be resurrected.

@poettering Sorry I cant provide any code towards this, however I find myself in the same situation as @lpar on this. As I've been learning to leverage systemd for all the things, a use case would be to use the journalctl -p command to filter stderr from the script run from a unit. I couldn't figure out why my output wasn't showing up as an 'error' in the journal until I found this.

In the container age, it seems all the rage is to just output everything to stdout and stderr and let the orchestration system handle the logging, be that podman - docker - or systemd/journald.

portante · 2021-05-20T01:45:32Z

If kubernetes is going to support journald for proper logging, by having conmon write to journald instead of to disk (today, having conmon write directly to files destabilizes a system), then we need to be sure we can label stderr and stdout error streams separately.

mattpr · 2022-04-05T11:17:15Z

A lot of apps are just writing to stdout/err for info/error log levels and then letting the journal or "something else" handle the log management. Especially true when you have one app instance per core running in parallel behind some load balancing.

Most of these apps are not writing syslog format logs and I'm not sure it makes sense for them to make that adaptation on the app message outputting end (app doesn't know who is consuming logs or how and shouldn't need to in many cases I think). So setting a priority on the app logger side isn't a great option.

If there were a way to specify default priority within the unit file, that might be good enough...

StandardOutput=journal@6
StandardError=journal@3

But it would be really great to be able to distinguish between messages written to stdout and stderr. The specific use-case is that as long as apps have good hygiene about not writing to stderr unless it is important, we can easily write filters to surface and alert on errors. A troubleshooter could then go find those errors in the journal in context along with info-level logs.

dmotte · 2022-12-20T22:10:46Z

Please why did you close this? It would be extremely useful to distinguish between stdout and stderr in journal

kooltux · 2022-12-27T19:33:42Z

Agreed with dmotte. Please explain the reason to close this issue. So many legacy software and scripts just use stdout/stderr to log things correctly with 2 levels of information.
Having a way to map these to syslog levels in a 'per-service' way would be really helpful for many sysadmins :)

audaki · 2023-02-13T09:16:12Z

@poettering

Please consider re-opening this.

The need is still here.

Currently the info (stdout vs stderr) is completely lost and we are debugging a system and can't filter stderr because the app didn't mark the std::cerr messages in any other way. I'm currently looking through the source of the app to find all the messages that go into std::cerr so we have at least some info.

mattpr · 2023-02-23T13:21:20Z

I think if this is going to happen, someone who wants this will need top implement a reasonable solution and open a new PR (this particular PR implementation was rejected).

There was already a suggestion for how to implement this in a clean way also preserving ordering.

I haven't done any C in a long time, and have too much on my plate to get back into it at a level where I would feel comfortable opening a PR on this anytime soon. So it won't be me. But I do keep hoping a C-dev who wants this and has more time can take Poettering's implementation suggestion and open another PR.

As a hack you can wrap your app exec in a script and run stdout/err through sed subprocesses to prepend syslog priority tags to each log line (e.g. <3>I'm an stderr log line) which supposedly the journal will pick up and respect. The journal would consume the stdout/err from the wrapper script with the priority prefixes attached. I haven't tested yet, but it is one possibility we are considering.

Or we may just stop using the journal for any info-level (stdout) logging. That way we know everything in the journal from the apps we run is an error. Going back to log files and logrotate for non-error logs.

poettering · 2023-02-27T12:32:53Z

@poettering

Please consider re-opening this.

The need is still here.

This is a PR and PRs can only be merged if they are brought into shape by someone doing the work. This didn't happen here, so it was eventually closed.

If you are interested in seeing this implemented. Step up, do the work, and we can get this done.

I think everyone fully agrees that the goal here is good. But if noone wants to do the work, then it's never going to happen. And keeping this PR open even though noone cares for it anymore is not the same as "doing the work".

Anyway: please submit a PR and we'll happily review it.

jkrauss82 · 2023-07-31T14:08:06Z

I am baffled that distinguishing between stdout and stderr is currently not implemented in systemd in 2023. This seems like a very basic feature to have for a system-wide logging facility, I wonder why it was decided to not have this built in from the start.

Just posting here because:

The need is still here.

For us currently this is a blocker to adopt journald for logging. We will have to stick to syslog / file logging for now, which is a shame given the advantages systemd and journald bring to the table.

mattpr · 2023-07-31T14:43:22Z

We will have to stick to syslog

@jkrauss82 if your apps are using syslog APIs anyway, you can use sd-daemon logging priority prefixes to get different log levels in the journal. See this answer for an example.

However for apps that just naively log to stdout/err the only option you have is to build a wrapper app that handles launching your app and dealing with translating stdout/err so they are differentiated in syslog/journal.

fabiokung force-pushed the split-stderr branch 3 times, most recently from da54bb1 to 17f0ebf Compare August 14, 2017 20:16

evverx mentioned this pull request Aug 30, 2017

man pages, journald: document the relationship between journald restarts, stdout/stderr log stream termination and SIGPIPE #6620

Closed

poettering requested changes Aug 31, 2017

View reviewed changes

poettering added journal pid1 reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Aug 31, 2017

evverx mentioned this pull request Sep 11, 2017

JOURNAL_STREAM doesn't seem to make sense when two different streams are used #6800

Closed

jlebon mentioned this pull request Sep 27, 2017

libpriv/scripts: prefix scriptlet outputs in journal coreos/rpm-ostree#998

Closed

fabiokung force-pushed the split-stderr branch from 17f0ebf to fb62c7d Compare February 4, 2018 08:12

fabiokung added 3 commits February 4, 2018 00:44

journal: change sd_journal_stream_fd to the header protocol v1

70c235b

Allowing it to configure a different _STDIO_STREAM field (stdout or stderr) in the future when so desired.

fabiokung force-pushed the split-stderr branch from fb62c7d to 70c235b Compare February 4, 2018 08:44

yuwata removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Feb 5, 2018

evverx mentioned this pull request Oct 16, 2018

core: implement per unit journal rate limiting #10308

Merged

poettering closed this Dec 27, 2018

florianjacob mentioned this pull request Jan 5, 2019

journald should distinguish service log msgs written to stdout and stderr #5019

Open

2 tasks

Nukesor mentioned this pull request Jan 23, 2022

Filter log output by stdout/stderr, and time Nukesor/pueue#239

Closed

journal: provide a way to distinguish between stdout and stderr streams #6599

journal: provide a way to distinguish between stdout and stderr streams #6599

Conversation

fabiokung commented Aug 12, 2017 • edited

fabiokung commented Aug 14, 2017

evverx commented Aug 18, 2017 • edited

poettering commented Aug 30, 2017

poettering left a comment

Choose a reason for hiding this comment

poettering Aug 31, 2017

Choose a reason for hiding this comment

poettering Aug 31, 2017

Choose a reason for hiding this comment

poettering Aug 31, 2017

Choose a reason for hiding this comment

poettering Aug 31, 2017

Choose a reason for hiding this comment

poettering Aug 31, 2017

Choose a reason for hiding this comment

poettering commented Aug 31, 2017

poettering commented Aug 31, 2017

portante commented Sep 5, 2017

poettering commented Sep 6, 2017

fabiokung commented Sep 29, 2017

poettering commented Nov 20, 2017

fabiokung commented Nov 20, 2017 via email

fabiokung commented Feb 4, 2018

fabiokung commented Apr 25, 2018

poettering commented Apr 27, 2018

fabiokung commented Apr 27, 2018 • edited

poettering commented Nov 13, 2018

evverx commented Nov 14, 2018

fabiokung commented Nov 18, 2018

fabiokung commented Nov 18, 2018

poettering commented Dec 27, 2018

poettering commented Dec 27, 2018

poettering commented Dec 27, 2018

lpar commented Jun 4, 2019

spmfox commented May 19, 2021 • edited

portante commented May 20, 2021

mattpr commented Apr 5, 2022

dmotte commented Dec 20, 2022

kooltux commented Dec 27, 2022

audaki commented Feb 13, 2023 • edited

mattpr commented Feb 23, 2023

poettering commented Feb 27, 2023

jkrauss82 commented Jul 31, 2023

mattpr commented Jul 31, 2023

fabiokung commented Aug 12, 2017 •

edited

evverx commented Aug 18, 2017 •

edited

fabiokung commented Apr 27, 2018 •

edited

spmfox commented May 19, 2021 •

edited

audaki commented Feb 13, 2023 •

edited