journald is unable to attribute messages incoming from processes that exited to their cgroup, due to /proc vs SCM_CREDS race #2913

Open
1 task done
oconnor663 opened this issue Mar 30, 2016 · 33 comments
Labels
bug 🐛 Programming errors, that need preferential fixing journal

Comments

@oconnor663

  • Bug report
systemd 229
+PAM -AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

[Arch Linux] It looks like journalctl --user-unit=... is dropping log lines that happen immediately before a user job exits. Note that I was only able to repro this with user jobs; system jobs don't seem to have this problem.

I create ~/.config/systemd/user/test.service with the following:

[Service]
ExecStart=/usr/bin/bash -c "echo before sleep; sleep 0.1; echo after sleep"

In one terminal I run journalctl --user-unit=test.service --follow. In another terminal, I run (a bunch of times) systemctl --user start test.service.

EXPECTED: I should see both before sleep and after sleep in the output of journalctl.

ACTUAL: I only see before sleep.

Here's what shows up in the -o verbose logs:

Tue 2016-03-29 20:41:14.671948 EDT [s=e9568fd8d1454a3980cda3097f710645;i=152f1f;b=21d5a26bac4b44a89e22b5cc7622f804;m=ab73464fe0;t=52f3967343f4c;x=74e3c7d0a425d757]
    _TRANSPORT=stdout
    PRIORITY=6
    _UID=1000
    _GID=1000
    _CAP_EFFECTIVE=0
    _SYSTEMD_OWNER_UID=1000
    _SYSTEMD_SLICE=user-1000.slice
    _MACHINE_ID=513a1e851c594baaaeefc79eca7fb0ab
    _HOSTNAME=arch-host
    SYSLOG_FACILITY=3
    _SYSTEMD_UNIT=user@1000.service
    _BOOT_ID=21d5a26bac4b44a89e22b5cc7622f804
    _EXE=/usr/bin/bash
    SYSLOG_IDENTIFIER=bash
    _COMM=bash
    _SYSTEMD_CGROUP=/user.slice/user-1000.slice/user@1000.service/test.service
    _SYSTEMD_USER_UNIT=test.service
    MESSAGE=before sleep
    _PID=3818
    _CMDLINE=/usr/bin/bash -c echo before sleep; sleep 0.1; echo after sleep
Tue 2016-03-29 20:41:14.772507 EDT [s=e9568fd8d1454a3980cda3097f710645;i=152f20;b=21d5a26bac4b44a89e22b5cc7622f804;m=ab7347d8af;t=52f396735c81b;x=eb4c5410c92a5ef4]
    _TRANSPORT=stdout
    PRIORITY=6
    _UID=1000
    _GID=1000
    _CAP_EFFECTIVE=0
    _MACHINE_ID=513a1e851c594baaaeefc79eca7fb0ab
    _HOSTNAME=arch-host
    SYSLOG_FACILITY=3
    _SYSTEMD_CGROUP=/
    _SYSTEMD_SLICE=-.slice
    _BOOT_ID=21d5a26bac4b44a89e22b5cc7622f804
    SYSLOG_IDENTIFIER=bash
    _COMM=bash
    _PID=3818
    MESSAGE=after sleep

Note that _SYSTEMD_USER_UNIT is not defined for the second message.

@evverx
Member

evverx commented Mar 30, 2016

See #1456 (comment)

But anyway, you are running into a well-known race: we look up the metadata via the PID of the peer, but that fails if the process has already exited by the time we try to read it. A fix for this requires a kernel change, so that we get the metadata at the same time as the messages.

See also https://bugs.freedesktop.org/show_bug.cgi?id=50184

#2280 is trying to improve the situation
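
A minimal sketch of the race described above, assuming Linux with /proc mounted. This is not journald code; `read_cgroup` and `demo` are hypothetical names used only for illustration. A child writes a line into a pipe and exits; the reader still gets the line, but by then the /proc entry it would need for the cgroup lookup is gone.

```python
import os


def read_cgroup(pid):
    """Return the contents of /proc/<pid>/cgroup, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None


def demo():
    """Write a message from a short-lived child, then look up its metadata too late."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write one line and exit immediately, like the end of test.service.
        try:
            os.close(r)
            os.write(w, b"after sleep\n")
        finally:
            os._exit(0)
    os.close(w)
    os.waitpid(pid, 0)          # the child has exited and has been reaped
    message = os.read(r, 1024)  # the message itself is still in the pipe
    os.close(r)
    return message, read_cgroup(pid)  # the metadata lookup comes too late
```

Here `demo()` returns the message together with `None` for the cgroup, which is the same situation as the second log entry in the report: the payload arrives, the attribution does not.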

@oconnor663
Author

@evverx good to know, thanks. Do you know why this doesn't seem to apply to system-level jobs, or was that maybe just a coincidence?

Also a super naive question: If systemd created the stdout pipe that bash is writing to, and systemd had all the unit metadata at that time, why does it need to ask bash for that same metadata later when it reads from the pipe?

@evverx
Member

evverx commented Mar 30, 2016

If systemd created the stdout pipe that bash is writing to, and systemd had all the unit metadata at that time, why does it need to ask bash for that same metadata later when it reads from the pipe?

Because you can change the metadata while the process is running :)

How to change _SYSTEMD_UNIT on the fly:

root:~# systemd-run sh -c 'while :; do echo Hola; sleep 10; done'
Running as unit run-rbb0c417419294b44b6b77b1a30ab12c6.service.

root:~# systemd-run sh -c 'while :; do echo Bro; sleep 10; done'
Running as unit run-r97cdfc6a615c43579e094b2df38fb42d.service.

root:~# systemctl status run-rbb0c417419294b44b6b77b1a30ab12c6.service
● run-rbb0c417419294b44b6b77b1a30ab12c6.service - /bin/sh -c while :; do echo Hola; sleep 10; done
   Loaded: loaded
Transient: yes
  Drop-In: /run/systemd/system/run-rbb0c417419294b44b6b77b1a30ab12c6.service.d
           └─50-Description.conf, 50-ExecStart.conf
   Active: active (running) since Wed 2016-03-30 03:03:16 UTC; 21s ago
 Main PID: 7927 (sh)
    Tasks: 2 (limit: 512)
   Memory: 196.0K
      CPU: 4ms
   CGroup: /system.slice/run-rbb0c417419294b44b6b77b1a30ab12c6.service
           ├─7927 /bin/sh -c while :; do echo Hola; sleep 10; done
           └─7940 sleep 10

root:~# echo 7927 >/sys/fs/cgroup/systemd/system.slice/run-r97cdfc6a615c43579e094b2df38fb42d.service/cgroup.procs

root:~# journalctl -b _PID=7927 -o verbose | grep _SYSTEMD_UNIT | sort -u
    _SYSTEMD_UNIT=run-r97cdfc6a615c43579e094b2df38fb42d.service
    _SYSTEMD_UNIT=run-rbb0c417419294b44b6b77b1a30ab12c6.service

Do you know why this doesn't seem to apply to system-level jobs, or was that maybe just a coincidence?

I think that was a coincidence.
systemd-journal sometimes loses _CMDLINE, _COMM, _EXE, _SYSTEMD_CGROUP, _SYSTEMD_UNIT

@poettering
Member

yeah, the race applies to all kinds of services.

@poettering poettering changed the title journalctl --user-unit=... is missing log lines journald is unable to attribute messages incoming from processes that exited to their cgroup, due to /proc vs SCM_CREDS race Apr 15, 2016
@poettering poettering added bug 🐛 Programming errors, that need preferential fixing journal labels Apr 15, 2016
@alban
Member

alban commented May 17, 2016

Do you know of a workaround? Like something that could be written in the .service file?

@evverx
Member

evverx commented May 17, 2016

@alban , try SyslogIdentifier=

Sets the process name to prefix log lines sent to the logging system or the kernel log buffer with.

and run journalctl _SYSTEMD_UNIT=unit + UNIT=unit + SYSLOG_IDENTIFIER=id

-bash-4.3# systemctl cat test-with-syslog-identifier --no-pager
# /run/systemd/system/test-with-syslog-identifier.service
[Service]
ExecStart=/bin/date
User=1000
SyslogIdentifier=heya-short-lived-date

-bash-4.3# systemctl start test-with-syslog-identifier
-bash-4.3# journalctl --sync

-bash-4.3# journalctl _SYSTEMD_UNIT=test-with-syslog-identifier.service + UNIT=test-with-syslog-identifier.service + SYSLOG_IDENTIFIER=heya-short-lived-date --no-hostname  -q | grep -i utc
May 17 04:51:13 heya-short-lived-date[174]: Tue May 17 04:51:13 UTC 2016

-bash-4.3# journalctl -u test-with-syslog-identifier --no-hostname -q | grep -i utc

Anyway, you lose some metadata:

-bash-4.3#  journalctl SYSLOG_IDENTIFIER=heya-short-lived-date --no-hostname -o verbose --no-pager -q
Tue 2016-05-17 04:51:13.998992 UTC [s=2a68b04bb2ef43989b6cd8140231f1d3;i=159d;b=d16c417784544164a1b4fe27b118bdd4;m=e897ed49c;t=533027d9dd090;x=790b88bff65a7412]
    SYSLOG_FACILITY=3
    _UID=1000
    _GID=1000
    _CAP_EFFECTIVE=0
    _SYSTEMD_CGROUP=/
    _SYSTEMD_SLICE=-.slice
    _BOOT_ID=d16c417784544164a1b4fe27b118bdd4
    _MACHINE_ID=d0f5a6724bb44ea4bf7fc18814287439
    _HOSTNAME=systemd-testsuite
    _COMM=date
    _TRANSPORT=stdout
    PRIORITY=6
    _PID=174
    SYSLOG_IDENTIFIER=heya-short-lived-date
    MESSAGE=Tue May 17 04:51:13 UTC 2016
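
A small sketch of how the combined query above could be assembled from a script. The helper name `journalctl_args` is hypothetical; the match syntax (`+` as a logical OR between match groups) and the `--no-hostname` and `-q` options are the ones used in the transcript above.

```python
import shlex


def journalctl_args(unit, identifier):
    """Build a journalctl command line that ORs three ways of finding a unit's logs."""
    return [
        "journalctl",
        f"_SYSTEMD_UNIT={unit}",            # messages attributed by journald
        "+",
        f"UNIT={unit}",                     # messages logged about the unit
        "+",
        f"SYSLOG_IDENTIFIER={identifier}",  # fallback when attribution was lost
        "--no-hostname",
        "-q",
    ]


if __name__ == "__main__":
    args = journalctl_args("test-with-syslog-identifier.service", "heya-short-lived-date")
    print(shlex.join(args))
```

The list form can be passed directly to `subprocess.run` on a machine with systemd. Note that the identifier match only helps if the identifier is unique to the unit.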

iaguis added a commit to kinvolk/rkt that referenced this issue May 17, 2016
There's a race [1] in systemd that causes the systemd unit name not to be
written to the journal for short-lived non-root services.

To provide a way to identify an app within a pod in the logs, we set
`SyslogIdentifier` to the app name as a workaround.

This causes processes forked by the app and pre/post start hooks to be
identified as the app, but it was already happening before we removed
appexec [2] with the binary exec'd by the app instead of the app name.

In theory we could remove that and we'll get the binary name of each
process executed in the pod as the syslog identifier, but then this
workaround would not *work*.

Setting the app name as the identifier is already an improvement over
the previous situation.

[1]: systemd/systemd#2913
[2]: rkt#2493
tmrts pushed a commit to tmrts/rkt that referenced this issue May 25, 2016
There's a race [1] in systemd that causes the systemd unit name not to be
written to the journal for short-lived non-root services.

To provide a way to identify an app within a pod in the logs, we set
`SyslogIdentifier` to the app name as a workaround.

This causes processes forked by the app and pre/post start hooks to be
identified as the app, but it was already happening before we removed
appexec [2] with the binary exec'd by the app instead of the app name.

In theory we could remove that and we'll get the binary name of each
process executed in the pod as the syslog identifier, but then this
workaround would not *work*.

Setting the app name as the identifier is already an improvement over
the previous situation.

[1]: systemd/systemd#2913
[2]: rkt#2493
@poettering poettering mentioned this issue Jul 21, 2016
2 tasks
@vcaputo
Member

vcaputo commented Dec 24, 2016

I have an idea for fixing this more properly, but it involves introducing a new linux-specific AF_UNIX socket behavior to the kernel:
Make SO_LINGER applicable to AF_UNIX sockets, with slightly different semantics vs. AF_INET; block close from completing until all written bytes have been consumed from the socket, even implicit close on process exit.

With this option enabled, rearrange journald to acquire the process metadata upon POLLIN but prior to consuming the log data from the socket, and applying the metadata to all lines present in the consumed buffer.

This way ephemeral processes and logs sent immediately before process exit cannot race with the metadata sampling.

Something could also be introduced for fd passes when this option is enabled as well: synchronously wait for in-flight data to be drained prior to performing the fd pass. This would allow journald the opportunity to sample /proc for any log data before the fd is moved to the destination process.

There's no requirement that this be SO_LINGER; it could be a new name, since the semantics are a little different (blocking implicit close on exit).

Just putting this out there, food for thought.

I was going to start hacking on this but #2473 gave me cause to retreat considering this takes us further down the AF_UNIX road. Please see my comment there as well (#2473 (comment)).

@poettering
Member

lucaswerkmeister added a commit to lucaswerkmeister/git that referenced this issue Jan 22, 2018
Several options imply --syslog, without there being a way to disable it
again. This commit adds that option.

This is useful, for instance, when running `git daemon` as a systemd
service with --inetd. systemd connects stderr to the journal by default,
so logging to stderr is useful. On the other hand, log messages sent via
syslog also reach the journal eventually, but run the risk of being
processed at a time when the `git daemon` process has already exited
(especially if the process was very short-lived, e.g. due to client
error), so that the journal can no longer read its cgroup and attach the
message to the correct systemd unit. See systemd/systemd#2913 [1].

[1]: systemd/systemd#2913

Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de>
@bl33pbl0p
Contributor

Just leaving a comment here: I saw a unit also missing _SYSTEMD_INVOCATION_ID=, which makes perfect sense. It seems journald reads it from what the link in /run/systemd/units points to, but that link can go away before journald can readlink the dangling symlink. We were using _SYSTEMD_INVOCATION_ID= to view logs from only the last invocation of the unit, but the unreliability makes it a lot more difficult (so now we fall back to filtering by _PID in combination with it). This seems easy to fix because systemd can create the link once and only update it the next time a unit starts and gets a new invocation id, thus preventing the race.
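
A rough sketch of the symlink lookup described above, using a temporary directory in place of /run/systemd/units. The link name, the ID value, and the function names are made up for illustration and are not taken from systemd's source.

```python
import os
import tempfile


def read_invocation_id(link_path):
    """Return the target of a dangling symlink used as a key/value store, or None."""
    try:
        return os.readlink(link_path)
    except FileNotFoundError:
        return None


def demo():
    """Read the link while the unit 'runs', then again after the link was removed."""
    with tempfile.TemporaryDirectory() as units_dir:
        link = os.path.join(units_dir, "invocation:test.service")  # hypothetical name
        # The target is not a real path; the symlink only carries the ID as data.
        os.symlink("0123456789abcdef0123456789abcdef", link)
        while_running = read_invocation_id(link)
        os.unlink(link)  # the unit stopped and the link was cleaned up
        after_stop = read_invocation_id(link)
    return while_running, after_stop
```

A reader that arrives after the cleanup gets nothing back, which matches the missing _SYSTEMD_INVOCATION_ID= field observed above.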

@lucaswerkmeister
Member

lucaswerkmeister commented Jan 27, 2018

I’m a bit confused about when this race condition happens and when it doesn’t happen. I’ll try to summarize my current understanding and hope it’s not too wrong, and then perhaps someone can fill in the missing bits :)

In theory, I don’t see why log messages via stdout/stderr should have to be racy at all. systemd connects them to the journal during setup and can also inform the journal of the unit and other meta-info at the same time. (Note that the “stream logging” section of systemd-journald(8) explicitly states that “metadata for records transferred via such standard output/error streams reflect the metadata of the peer the stream was originally created for. If the stream connection is passed on to other processes … the log records … will continue to describe the original process”, supporting this assumption.) On the other hand, syslog() connects to /dev/log using SOCK_DGRAM, so there’s a possible race condition.
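
To make the datagram case concrete, here is a sketch (assuming Linux; not journald code, and the function name is hypothetical) of a receiver that enables SO_PASSCRED on an AF_UNIX SOCK_DGRAM socket. The kernel attaches the sender's PID, UID and GID to the message as SCM_CREDENTIALS, but everything else has to be looked up in /proc by PID, and the sender may be gone by then.

```python
import os
import socket
import struct

UCRED = struct.Struct("iII")  # struct ucred: pid, uid, gid


def receive_from_exited_sender():
    """Receive a datagram plus SCM_CREDENTIALS from a sender that already exited."""
    rx, tx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
    child = os.fork()
    if child == 0:
        # Child: send one syslog-style datagram and exit right away.
        try:
            rx.close()
            tx.send(b"<30>demo: last words")
        finally:
            os._exit(0)
    tx.close()
    os.waitpid(child, 0)  # the sender is gone before we read its message
    data, ancdata, _flags, _addr = rx.recvmsg(4096, socket.CMSG_SPACE(UCRED.size))
    rx.close()
    sender_pid = None
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            sender_pid, _uid, _gid = UCRED.unpack(cdata[:UCRED.size])
    # The credentials survived, but the /proc entry needed for cgroup lookup did not.
    proc_exists = sender_pid is not None and os.path.exists(f"/proc/{sender_pid}")
    return data, sender_pid, child, proc_exists
```

The PID in the credentials is captured at send time, so it is reliable, while the cgroup (and with it the unit name) is not part of the credentials at all.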

In practice, this communication of the unit name was added in 62bca2c, first shipped in systemd v186 – but only for system units (later slightly amended in c867611). So I think the original bug report here is actually accurate in saying that, as of systemd v229, for stdout/stderr (as in test.service, which uses echo), _SYSTEMD_USER_UNIT was affected by this race but _SYSTEMD_UNIT was not. (Note that the linked fd.o bug, systemd-journal sometimes loses _CMDLINE, _COMM, _EXE, _SYSTEMD_CGROUP, _SYSTEMD_UNIT, predates 62bca2c.)

Now, enter 22e3a02 / #6392. With this change (see the long description in journald-context.c), metadata is cached by process ID, and stream clients (e.g. units with stdout or stderr connected to the daemon) pin a cache entry, which is then reused for any messages received from that PID, regardless of the transport being used. Therefore, if I understand it correctly, starting with systemd v235, even messages sent via syslog should always have the correct meta information attached as long as the unit has stdout or stderr connected to the journal, even if it doesn’t use them at all. The only case in which the race condition could still happen is if there are more than five seconds between a process exiting and the journal processing its message, in which case the cache entry is considered too stale to be used.
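
The caching behaviour summarized above can be modelled roughly as follows. This is a toy model built only from the description in this comment, not a port of journald-context.c; the class, method, and parameter names are hypothetical, and the real refresh rules are more involved.

```python
from dataclasses import dataclass

MAX_AGE_SECONDS = 5.0  # staleness limit mentioned in the comment above


@dataclass
class Entry:
    metadata: dict
    refreshed_at: float
    pins: int = 0


class ContextCache:
    """Per-PID metadata cache; pinned entries outlive the process they describe."""

    def __init__(self, loader):
        self._loader = loader  # callable: pid -> metadata dict, or None if the process is gone
        self._entries = {}

    def pin(self, pid, now):
        """Called when a stream (stdout/stderr) connection is set up for pid."""
        entry = self._entries.get(pid)
        if entry is None:
            entry = Entry(self._loader(pid) or {}, now)
            self._entries[pid] = entry
        entry.pins += 1

    def lookup(self, pid, now):
        """Return metadata for a message from pid, via any transport."""
        fresh = self._loader(pid)
        entry = self._entries.get(pid)
        if fresh is not None:
            # Process still exists: refresh the cache and use the fresh data.
            if entry is None:
                self._entries[pid] = Entry(fresh, now)
            else:
                entry.metadata, entry.refreshed_at = fresh, now
            return fresh
        if entry is None:
            return None  # process gone and never seen before: the original race
        if entry.pins > 0 or now - entry.refreshed_at <= MAX_AGE_SECONDS:
            return entry.metadata  # process gone, but the cached data is still usable
        del self._entries[pid]  # unpinned and too old: treat as unknown
        return None
```

In this model a message from an exited process is still attributed if its entry is pinned or was refreshed within the last five seconds, and is unattributed otherwise.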

@bl33pbl0p
Contributor

bl33pbl0p commented Jan 28, 2018

I was also diving into the code used for LogExtraField=, and exploring whether it was possible to use the same mechanism to attach the unit name to messages (because we already use it to do so, and have found it to be race-free, with LogExtraField=PROGRAM_UNIT=%n).

@poettering
Member

This seems easy to fix because systemd can create the link once and only update it the next time a unit starts and gets a new invocation id, thus preventing the race.

well, we need some clean-up scheme there. units come and go all the time, and run a single time only, and we cannot leave data in /run around unbounded... Just think "systemd-run" for example which creates a unit that is exactly run once and never again

gitster pushed a commit to git/git that referenced this issue Feb 6, 2018
This new option can be used to override the implicit --syslog of
--inetd, or to disable all logging. (While --detach also implies
--syslog, --log-destination=stderr with --detach is useless since
--detach disassociates the process from the original stderr.) --syslog
is retained as an alias for --log-destination=syslog.

--log-destination always overrides implicit --syslog regardless of
option order. This is different than the “last one wins” logic that
applies to some implicit options elsewhere in Git, but should hopefully
be less confusing. (I also don’t know if *all* implicit options in Git
follow “last one wins”.)

The combination of --inetd with --log-destination=stderr is useful, for
instance, when running `git daemon` as an instanced systemd service
(with associated socket unit). In this case, log messages sent via
syslog are received by the journal daemon, but run the risk of being
processed at a time when the `git daemon` process has already exited
(especially if the process was very short-lived, e.g. due to client
error), so that the journal daemon can no longer read its cgroup and
attach the message to the correct systemd unit (see systemd/systemd#2913
[1]). Logging to stderr instead can solve this problem, because systemd
can connect stderr directly to the journal daemon, which then already
knows which unit is associated with this stream.

[1]: systemd/systemd#2913

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@berarma

berarma commented Feb 26, 2023

Why keep this option when it's unreliable?

@DemiMarie

At least in some cases this can be solved by having processes send their own pidfd to systemd-journald.

@poettering
Member

@DemiMarie journald implements the BSD syslog proto. There are about a million implementations of the client side for that around, including in glibc. None of those send a pidfd of themselves along.

(There's progress on that front btw, some people are prepping a kernel patch for adding SCM_PIDFD that works alongside SCM_CREDENTIALS and allows receivers to get the sending pid as pidfd. But this is not sufficient to close the issue fully, since this won't pin the /proc/ entry, i.e. while we might safely reference the sending process that way and will never mistake it for something else if that process died we still cannot read info from it)
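
A hedged sketch of the limitation described above, assuming Linux 5.3 or newer and Python 3.9 or newer (for `os.pidfd_open` and `signal.pidfd_send_signal`). The function name is hypothetical, and it returns `None` where pidfds are unavailable. It shows that a pidfd keeps referring to exactly one process, but does not keep that process's /proc entry readable once it has exited and been reaped.

```python
import os
import signal


def pidfd_after_exit():
    """Return (signal_delivered, proc_readable) for a pidfd whose process is gone."""
    if not hasattr(os, "pidfd_open") or not hasattr(signal, "pidfd_send_signal"):
        return None  # Python too old or platform without pidfd support
    pid = os.fork()
    if pid == 0:
        signal.pause()  # child: stay alive until the parent kills us
        os._exit(0)
    try:
        pidfd = os.pidfd_open(pid)
    except OSError:
        pidfd = None  # kernel without pidfd_open
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # the process has exited and has been reaped
    if pidfd is None:
        return None
    try:
        try:
            signal.pidfd_send_signal(pidfd, 0)  # signal 0: existence check only
            delivered = True
        except ProcessLookupError:
            delivered = False  # the pidfd still names the dead process, never a new one
        proc_readable = os.path.exists(f"/proc/{pid}/cgroup")
        return delivered, proc_readable
    finally:
        os.close(pidfd)
```

Both values come back false on a system with pidfd support: the reference is safe against PID reuse, but there is nothing left to read metadata from.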

@DemiMarie

(There's progress on that front btw, some people are prepping a kernel patch for adding SCM_PIDFD that works alongside SCM_CREDENTIALS and allows receivers to get the sending pid as pidfd. But this is not sufficient to close the issue fully, since this won't pin the /proc/ entry, i.e. while we might safely reference the sending process that way and will never mistake it for something else if that process died we still cannot read info from it)

What about having SCM_PIDFD pin the /proc entry, or having an SCM_PROCFD that gets a /proc FD?

@vcaputo
Member

vcaputo commented Feb 27, 2023

(There's progress on that front btw, some people are prepping a kernel patch for adding SCM_PIDFD that works alongside SCM_CREDENTIALS and allows receivers to get the sending pid as pidfd. But this is not sufficient to close the issue fully, since this won't pin the /proc/ entry, i.e. while we might safely reference the sending process that way and will never mistake it for something else if that process died we still cannot read info from it)

Seriously? So you can't just use the PIDFD as a dirfd with the *at() syscalls to access all the children?

Surprising? No. Disappointing? Yes. I'm pretty sure pidfd is primarily about sending signals to processes by PID race-free, but we have more pressing needs than just that.

@yan12125
Contributor

yan12125 commented Jan 6, 2024

Seriously? So you can't just use the PIDFD as a dirfd with the *at() syscalls to access all the children?

There was a patch from 2019 [1] that adds an ioctl command PIDFD_GET_PROCFD, which gives a dirfd from a pidfd, but for some reason it was not merged along with the other pidfd-related patches.

[1] https://lore.kernel.org/lkml/20190329155425.26059-3-christian@brauner.io/

DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this issue May 5, 2024
Avoid hitting systemd#2913 by adding
some more sleeps. This is required to make the test pass when executed
with mkosi on my machine.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this issue May 6, 2024
Avoid hitting systemd#2913 by adding
some more sleeps. This is required to make the test pass when executed
with mkosi on my machine.
@jbosboom
Contributor

As a workaround for services whose code you can change and that use the native journal protocol, you can synchronize with journald's message processing by sending a datagram containing the write end of a pipe, closing your copy of the write end, and blocking on reading from the read end, which will return EOF when journald closes its copy of the write end. Unix datagrams are not reordered and once sd-event calls into server_process_datagram, it calls server_process_native_message which calls client_context_get before yielding to the event loop. There's no ordering against syslog or stdout/stderr stream logging. If you use this journal barrier just before exiting it may add latency to service exit. Also, journald will log an error or warning to the system journal as this is a native journal protocol violation.

I am not affiliated with systemd, this is not API, this comment is based on my empirical experiments ("works for me") and code reading, there's even less warranty than the usual no warranty.

If you send just the FD, journald will log an error (file not regular), but if you also send data, it logs a warning (too many file descriptors). If it were possible to send an FD that stats as a regular file with size 0 and detect when journald had closed it, that would avoid the warning/error, but I don't believe that's possible. If you send data with the FD to get the warning rather than the error, I suggest BARRIER=1 by analogy with the sd_notify barrier protocol. It's a syntactically valid field that obviously is not part of a normal message, might actually be implemented for this purpose, and in any case seems unlikely to be used for other purposes.
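
The barrier described above could look roughly like this in Python. This is a sketch of the unofficial trick from this comment, not a supported API. The function name is hypothetical; the socket path is journald's native-protocol socket, and `BARRIER=1` is the payload suggested above.

```python
import array
import os
import socket

JOURNAL_SOCKET = "/run/systemd/journal/socket"  # journald's native-protocol socket


def journal_barrier(path=JOURNAL_SOCKET):
    """Block until the receiver at `path` has processed (and closed) our datagram."""
    read_end, write_end = os.pipe()
    try:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
                fds = array.array("i", [write_end])
                # One datagram: a syntactically valid field plus the pipe's write end.
                sock.sendmsg(
                    [b"BARRIER=1\n"],
                    [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)],
                    0,
                    path,
                )
        finally:
            os.close(write_end)  # drop our copy; only the in-flight copy remains
        # read() returns b"" (EOF) once the receiver has closed its copy of the write end.
        while os.read(read_end, 1):
            pass
    finally:
        os.close(read_end)
```

A service would first send its real log messages over the same socket and then call `journal_barrier()` just before exiting. As noted above, journald logs a warning for such a datagram, and nothing here orders against syslog or stream logging.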
