File Descriptor Store can be used from ExecStartPre= but missing LISTEN_FDS/FDNAMES #37192

@mk-fg

Description

systemd version the issue has been seen with

257.5-1-arch

Used distribution

Arch (current)

Linux kernel version used

6.12

CPU architectures issue was seen on

None

Component

systemd

Expected behaviour you didn't see

Using test.service file like this:

[Service]
Type=notify
NotifyAccess=exec
FileDescriptorStoreMax=32
DynamicUser=yes
ExecStartPre=+load-ebpfs-and-store-map-fds
ExecStart=unprivileged-daemon-that-uses-stored-ebpf-maps

I expected the ExecStartPre=+... process to either have full access to the File Descriptor Store or none at all; which of the two applies doesn't seem to be well documented.

My use-case here is to run a privileged process that passes long-lived file descriptors to the main unprivileged one, without having to implement a privilege drop there or use ambient capabilities.
After the initial start, that privileged process is supposed to detect the already-stored file descriptors and exit without needing to do anything.

Unexpected behaviour you saw

In reality, the ExecStartPre=+... process appears to be able to add fds via sd_pid_notifyf_with_fds(..., "FDSTORE=1\n...) but cannot see them there on service restarts:

  • sd_listen_fds_with_names() returns 0, even when systemctl show test lists multiple stored descriptors.
  • Environment has NOTIFY_SOCKET and FDSTORE vars, but not LISTEN_FDS and such.
  • File descriptors starting from SD_LISTEN_FDS_START are not open/passed to ExecStartPre=+... process.
  • Adding fds to the store via FDSTORE=1 seems to work without issues.
  • ExecStart=... process has LISTEN_FDS/FDNAMES and can access those fds stored by ExecStartPre=+....
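
To make the check in the list above concrete, here is a rough Python approximation of what sd_listen_fds_with_names() looks at. It is a simplified sketch, not the libsystemd implementation: the function name listen_fds_with_names and its parameters are hypothetical, and padding missing names with "unknown" is an assumption made for brevity.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed-in fd, as defined in sd-daemon.h


def listen_fds_with_names(environ=None, pid=None):
    """Return [(fd, name), ...] for fds passed in by the service manager.

    Hypothetical helper that approximates the environment checks of
    sd_listen_fds_with_names(); it does not validate the fds themselves.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid

    # The variables only apply to the process named in LISTEN_PID.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    if count <= 0:
        return []

    raw = environ.get("LISTEN_FDNAMES")
    names = raw.split(":") if raw else []
    # Assumption: fill in "unknown" when fewer names than fds are given.
    names = (names + ["unknown"] * count)[:count]
    return [(SD_LISTEN_FDS_START + i, names[i]) for i in range(count)]
```

With only NOTIFY_SOCKET and FDSTORE in the environment, as observed for the ExecStartPre=+... process, this returns an empty list, so a helper relying on it cannot tell an empty store from a populated one.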

The problem here is that the ExecStartPre=+... process needs to be aware of already-stored file descriptors as well, so that it doesn't needlessly reopen the same things, or so that it can close stored ones as needed.
With the obvious sd_listen_fds_with_names() check returning 0, it will just keep opening and adding fds until the FileDescriptorStoreMax= limit is exhausted.

If the LISTEN_FDS/FDNAMES vars and the fds themselves were passed there, it'd be possible, but this "can store but cannot see" half-way operation looks like a bug.
I haven't come across any definitive mention in the documentation of how the File Descriptor Store is supposed to interact with pre/post commands, but might've simply missed it.

It'd be nice if full fd-store access was provided to an ExecStartPre= command.
Alternatively, if such a command is not supposed to have access (presumably for good reasons), FDSTORE= probably shouldn't be passed to it in the environment, and adding fds via FDSTORE=1 probably shouldn't work either.
If the behavior of ExecStartPre= and ExecStart= differs in this regard, maybe it should be documented how and why, e.g. in the ExecStartPre= directive description or on "The File Descriptor Store" page.

Steps to reproduce the problem

  • Use the test.service unit template above.
  • Use an ExecStartPre=+... command that stores a file descriptor via the sd_notify socket, prints its environment, and checks fds 3+ via fcntl, together with ExecStart=sh -c 'env; sleep infinity'.
  • Run systemctl restart test, and note that the first command doesn't see the fds it stored on the previous start.
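
A minimal sketch of an ExecStartPre=+... helper along the lines of the steps above, written in Python instead of the C sd-daemon API. The names store_fd and open_fds_from, the fd name "test-fd", and the use of /dev/null as the stored descriptor are all hypothetical choices for illustration; the original command used for the report is not shown here.

```python
import array
import fcntl
import os
import socket


def store_fd(fd, name, notify_path):
    """Send FDSTORE=1 plus one fd to the notify socket (hypothetical helper)."""
    # A leading "@" in NOTIFY_SOCKET denotes an abstract-namespace socket.
    addr = "\0" + notify_path[1:] if notify_path.startswith("@") else notify_path
    msg = f"FDSTORE=1\nFDNAME={name}".encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(addr)
        # The fd travels as SCM_RIGHTS ancillary data next to the message.
        sock.sendmsg(
            [msg],
            [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))],
        )


def open_fds_from(start=3, limit=64):
    """Return fds in [start, limit) that are currently open in this process."""
    found = []
    for fd in range(start, limit):
        try:
            fcntl.fcntl(fd, fcntl.F_GETFD)  # raises OSError (EBADF) if closed
        except OSError:
            continue
        found.append(fd)
    return found


if __name__ == "__main__":
    # Print what the service manager passed in, then store one fd if possible.
    for key in ("NOTIFY_SOCKET", "FDSTORE", "LISTEN_PID", "LISTEN_FDS", "LISTEN_FDNAMES"):
        print(f"{key}={os.environ.get(key)}")
    print("open fds 3+:", open_fds_from())
    notify = os.environ.get("NOTIFY_SOCKET")
    if notify:
        fd = os.open("/dev/null", os.O_RDONLY)  # stand-in for a real resource
        store_fd(fd, "test-fd", notify)
        os.close(fd)
```

Run as the ExecStartPre=+... command, a helper like this should print the environment and inherited fds on each start, which makes it easy to compare against the output of systemctl show test after a restart.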

Additional program output to the terminal or log subsystem illustrating the issue
