Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit fails with EXIT_EXEC / NAMESPACE after upgrade von v247 to v249 / v248 #20514

Closed
andir opened this issue Aug 22, 2021 · 4 comments · Fixed by #20515
Closed

Unit fails with EXIT_EXEC / NAMESPACE after upgrade von v247 to v249 / v248 #20514

andir opened this issue Aug 22, 2021 · 4 comments · Fixed by #20515
Labels
bug 🐛 Programming errors, that need preferential fixing pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit

Comments

@andir
Copy link
Contributor

andir commented Aug 22, 2021

systemd version the issue has been seen with

v248 & v249

Used distribution

Debian 11, NixOS unstable, Fedora 34

Linux kernel version used (uname -a)

Debian: 5.10.0-8, NixOS: 5.14-rc6, Fedora: 5.13.12-200

CPU architecture issue was seen on

x86_64

Expected behaviour you didn't see

After upgrading a system from version 247 to 249 (or 248) a service unit (as show below) fails during startup with EXIT_EXEC and with version 248 it fails during namespace creation.

Note: I am not entirely sure the way the unit is written is 100% like it should be. I am just debugging this regression that I noticed while trying to upgrade the systemd package in NixOS.

Unit file as used on Debian (v247):

[Unit]

[Service]
BindReadOnlyPaths=/usr/bin/hello
BindReadOnlyPaths=/lib/x86_64-linux-gnu/libc.so.6
BindReadOnlyPaths=/lib64/ld-linux-x86-64.so.2


ExecStart=/usr/bin/hello
MountAPIVFS=false
PrivateDevices=false
PrivateMounts=true
PrivateTmp=false
PrivateUsers=false
ProtectControlGroups=false
ProtectKernelModules=false
ProtectKernelTunables=false
RootDirectory=/var/empty
TemporaryFileSystem=/

Unit file as used on Fedora (v248):

# /etc/systemd/system/test.service
[Unit]

[Service]
BindReadOnlyPaths=/usr/bin/hello
BindReadOnlyPaths=/lib64/libc.so.6
BindReadOnlyPaths=/lib64/ld-linux-x86-64.so.2

ExecStart=/usr/bin/hello
MountAPIVFS=false
PrivateDevices=false
PrivateMounts=true
PrivateTmp=false
PrivateUsers=false
ProtectControlGroups=false
ProtectKernelModules=false
ProtectKernelTunables=false
RootDirectory=/var/empty
TemporaryFileSystem=/

Unit as used on NixOS (v247 / v249):

# systemctl cat basic-confinement.service
# /etc/systemd/system/basic-confinement.service
[Service]
BindReadOnlyPaths=/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash:/bin/sh
BindReadOnlyPaths=/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1
BindReadOnlyPaths=/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23
BindReadOnlyPaths=/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23
BindReadOnlyPaths=/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47
BindReadOnlyPaths=/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start
BindReadOnlyPaths=/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5
BindReadOnlyPaths=/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10
BindReadOnlyPaths=/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2

# points to a shell scripts (with a shebang pointing to `/nix/store/...-bash-interactive-4.4-p23/bin/bash` and then calls `hello` by its absolute path)
ExecStart=/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start/bin/basic-confinement-start 
MountAPIVFS=false
PrivateDevices=false
PrivateMounts=true
PrivateTmp=false
PrivateUsers=false
ProtectControlGroups=false
ProtectKernelModules=false
ProtectKernelTunables=false
RootDirectory=/var/empty
TemporaryFileSystem=/

Unexpected behaviour you saw
On v249 the unit fails to start with 203 (EXIT_EXEC). On v248 (Fedora 34) the unit would fail during namespace setup.

v248 (Fedora 34) error

Aug 22 21:20:59 fedora systemd[1]: Started test.service.
Aug 22 21:20:59 fedora systemd[26092]: Failed to create destination mount point node '/var/empty/lib64/ld-linux-x86-64.so.2': Operation not permitted
Aug 22 21:20:59 fedora systemd[26092]: Failed to mount /usr/lib64/ld-2.33.so to /var/empty/lib64/ld-linux-x86-64.so.2: No such file or directory
Aug 22 21:20:59 fedora systemd[26092]: test.service: Failed to set up mount namespacing: /var/empty/lib64/ld-linux-x86-64.so.2: No such file or directory
Aug 22 21:20:59 fedora systemd[26092]: test.service: Failed at step NAMESPACE spawning /usr/bin/hello: No such file or directory
Aug 22 21:20:59 fedora systemd[1]: test.service: Main process exited, code=exited, status=226/NAMESPACE
Aug 22 21:20:59 fedora systemd[1]: test.service: Failed with result 'exit-code'.

When changing the BindReadOnlyPaths= to just /lib64/ and /usr/bin/ it also ends up in a 203.

v249 (NixOS) error

Aug 22 19:27:31 nixos systemd[1]: Started basic-confinement.service.
Aug 22 19:27:31 nixos systemd[1]: basic-confinement.service: Main process exited, code=exited, status=203/EXEC
Aug 22 19:27:31 nixos systemd[1]: basic-confinement.service: Failed with result 'exit-code'.

v247 (NixOS & Debian) success

Aug 22 21:50:09 nixos systemd[1]: Started basic-confinement.service.
Aug 22 21:50:09 nixos basic-confinement-start[304]: Hello, world!
Aug 22 21:50:09 nixos systemd[1]: basic-confinement.service: Succeeded.

Steps to reproduce the problem

  1. Use your favorite flavor of distribution and create a sevice unit with the appropirate example from above.
  2. systemctl daemon-reload
  3. systemctl start <my-unit>.service
  4. systemctl status <mu-unit>.service

Additional program output to the terminal or log subsystem illustrating the issue

I did run the whole process on v249 under strace in a systemd-nspawn container and here is what I saw:

It seems like systemd tries to access the ExecStart= exectuable via /prco/self/fd/%d before startup which isn't there anymore as it does chroot just before that and then there is no /proc anymore.

After setting up all the (ro) bind mounts systemd calls "chroots" to /var/empty (our RootDirectory):

[pid   367] chdir("/var/empty")         = 0                                                                                                                                                                                                                                                                                                                 
[pid   367] mount("/var/empty", "/", NULL, MS_MOVE, NULL) = 0                                                                                                                                                                                                                                                                                               
[pid   367] chroot(".")                 = 0                                                                                                                                                                                                                                                                                                                 
[pid   367] chdir("/")                  = 0                                                                                                                                                                                                                                                                                                                 
[pid   367] mount(NULL, "/", NULL, MS_REC|MS_SHARED, NULL) = 0                                                                                                                                                                                                                                                                                              
[pid   367] mount(NULL, "/run/systemd/incoming", NULL, MS_SLAVE, NULL) = 0

Immediately after it then tries to access the executable.

systemd/src/core/execute.c

Lines 4351 to 4353 in f95d1ef

_cleanup_free_ char *executable = NULL;
_cleanup_close_ int executable_fd = -1;
r = find_executable_full(command->path, /* root= */ NULL, false, &executable, &executable_fd);

It calls the find_exectuable_full function and eventually the check_x_access function.

[pid   367] openat(AT_FDCWD, "/nix/store/....-unit-script-basic-confinement-start/bin/basic-confinement-start", O_RDONLY|O_CLOEXEC|O_PATH) = 3                                                                                                                                                                                  
[pid   367] newfstatat(3, "", {st_mode=S_IFREG|0555, st_size=76, ...}, AT_EMPTY_PATH) = 0                                                                                                                                                                                                                                                                   
[pid   367] access("/proc/self/fd/3", X_OK) = -1 ENOENT (No such file or directory)                                                                                           
[pid   367] statfs("/proc/", 0x7ffec01d64a0) = -1 ENOENT (No such file or directory)
[pid   367] close(3)                    = 0

But there is no /proc anymore which leads to setting the exit status to 203:

systemd/src/core/execute.c

Lines 4353 to 4367 in f95d1ef

r = find_executable_full(command->path, /* root= */ NULL, false, &executable, &executable_fd);
if (r < 0) {
if (r != -ENOMEM && (command->flags & EXEC_COMMAND_IGNORE_FAILURE)) {
log_unit_struct_errno(unit, LOG_INFO, r,
"MESSAGE_ID=" SD_MESSAGE_SPAWN_FAILED_STR,
LOG_UNIT_INVOCATION_ID(unit),
LOG_UNIT_MESSAGE(unit, "Executable %s missing, skipping: %m",
command->path),
"EXECUTABLE=%s", command->path);
return 0;
}
*exit_status = EXIT_EXEC;
return log_unit_struct_errno(unit, LOG_INFO, r,

The corresponding mount table (just before the chdir & chroot above):

/ /dev rw,nosuid master:226 - tmpfs tmpfs rw,size=4096k,nr_inodes=65536,mode=755,uid=555941888,gid=555941888
/ /dev/mqueue rw,nosuid,nodev,noexec,relatime master:409 - mqueue mqueue rw
/ /dev/pts rw,nosuid,noexec,relatime master:397 - devpts devpts rw,gid=555941891,mode=620,ptmxmode=666
/ /dev/shm rw,nosuid,nodev master:383 - tmpfs tmpfs rw,size=13184448k,nr_inodes=409600,uid=555941888,gid=555941888
/ /proc rw,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/ /run rw,nosuid,nodev master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/ /run/keys rw,nosuid,nodev,relatime master:412 - ramfs none rw,mode=750
/ /run/user/0 rw,nosuid,nodev,relatime master:550 - tmpfs tmpfs rw,size=13184448k,nr_inodes=3296112,mode=700,uid=555941888,gid=555941888
/ /run/wrappers rw,nodev,relatime master:413 - tmpfs tmpfs rw,mode=755,uid=555941888,gid=555941888
/ /sys ro,nosuid,nodev,noexec,relatime master:203 - sysfs sysfs rw
/ /sys/fs/cgroup ro,nosuid,nodev,noexec master:410 - tmpfs tmpfs ro,size=4096k,nr_inodes=1024,mode=755,uid=555941888,gid=555941888
/ /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime master:411 - cgroup cgroup rw,xattr,name=systemd
/ /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime master:414 - cgroup2 cgroup2 rw,nsdelegate,memory_recursiveprot
/ /tmp rw,nosuid,nodev master:199 - tmpfs tmpfs rw,size=13184448k,nr_inodes=409600,uid=555941888,gid=555941888
/ /var/empty rw,nodev - tmpfs tmpfs rw,size=13184448k,nr_inodes=409600,mode=755,uid=555941888,gid=555941888
/.#machine.tmp.2R1ZqvKiFU7715321801a98f3e / rw,relatime master:197 - tmpfs tmpfs rw
/.#proc-kmsg57947c1f49c45046//deleted /proc/kmsg rw,nosuid,nodev master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/.#proc-sys-kernel-random-boot-id9fc0bec3172289e7//deleted /proc/sys/kernel/random/boot_id ro,nosuid,nodev,noexec master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/.#proc-sys-kernel-random-boot-id9fc0bec3172289e7//deleted /proc/sys/kernel/random/boot_id rw,nosuid,nodev master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/acpi /proc/acpi ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/asound /proc/asound ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/bus /proc/bus ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/fs /proc/fs ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/host /run/host ro,nosuid,nodev,noexec master:399 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/irq /proc/irq ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/scsi /proc/scsi ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/store /nix/store ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash /var/empty/bin/sh ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/jz2jcdy3vkbq8ryw0gr8qw6qlrm3qwcc-tzdata-2021a/share/zoneinfo/Europe/Berlin /etc/localtime ro,nosuid,nodev,relatime master:407 - zfs tank/nix rw,xattr,posixacl
/store/khxp7n920m5cncavw2fah7npcjbnfs1m-etc-os-release /run/host/os-release ro,nosuid,nodev,noexec,relatime master:400 - zfs tank/nix rw,xattr,posixacl
/store/khxp7n920m5cncavw2fah7npcjbnfs1m-etc-os-release /run/host/os-release ro,relatime master:402 - zfs tank/nix rw,xattr,posixacl
/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/sys /proc/sys ro,nosuid,nodev,noexec,relatime master:408 - proc proc rw
/systemd/nspawn/propagate/tmp.2R1ZqvKiFU-b05849ffb33222ee /run/host/incoming ro master:401 - tmpfs tmpfs rw,size=32961120k,mode=755
/systemd/nspawn/propagate/tmp.2R1ZqvKiFU-b05849ffb33222ee /run/host/incoming rw,nosuid,nodev master:403 - tmpfs tmpfs rw,size=32961120k,mode=755
/systemd/propagate/basic-confinement.service /var/empty/run/systemd/incoming ro,nosuid,nodev master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888
/var/nix/daemon-socket /nix/var/nix/daemon-socket ro,relatime master:405 - zfs tank/nix rw,xattr,posixacl
/var/nix/db /nix/var/nix/db ro,relatime master:406 - zfs tank/nix rw,xattr,posixacl

If we filter that for paths in /var/empty we get the below list.

/ /var/empty rw,nodev - tmpfs tmpfs rw,size=13184448k,nr_inodes=409600,mode=755,uid=555941888,gid=555941888
/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash /var/empty/bin/sh ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 ro,relatime master:404 - zfs tank/nix rw,xattr,posixacl
/systemd/propagate/basic-confinement.service /var/empty/run/systemd/incoming ro,nosuid,nodev master:398 - tmpfs tmpfs rw,size=26368896k,nr_inodes=819200,mode=755,uid=555941888,gid=555941888

And indeed there is no /proc anymore after the chroot("/var/empty") call.

Here is the unit startup with v249 on NixOS and SYSTEMD_LOG_LEVEL=debug:

# journalctl -u basic-confinement.service  -n 1000
-- Journal begins at Sun 2021-08-22 22:24:18 CEST, ends at Sun 2021-08-22 22:25:03 CEST. --
Aug 22 22:24:34 nixos systemd[1]: basic-confinement.service: Collecting.
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Trying to enqueue job basic-confinement.service/start/replace
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Installed new job basic-confinement.service/start as 281
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Enqueued job basic-confinement.service/start as 281
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Failed to set invocation ID on control group /system.slice/basic-confinement.service, ignoring: Operation not permitted
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Failed to remove delegate flag on control group /system.slice/basic-confinement.service, ignoring: Operation not permitted
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Passing 0 fds to service
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: About to execute /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start/bin/basic-confinement-start
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Forked /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start/bin/basic-confinement-start as 312
Aug 22 22:24:43 nixos systemd[312]: PR_SET_MM_ARG_START failed: Operation not permitted
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Changed dead -> running
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Job 281 basic-confinement.service/start finished, result=done
Aug 22 22:24:43 nixos systemd[1]: Started basic-confinement.service.
Aug 22 22:24:43 nixos systemd[312]: basic-confinement.service: Kernel keyring access prohibited, ignoring.
Aug 22 22:24:43 nixos systemd[312]: Failed at setting rlimit 524288 for resource RLIMIT_NOFILE. Will attempt setting value 4096 instead.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /var/empty on /var/empty (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/
Aug 22 22:24:43 nixos systemd[312]: Successfully unmounted /var/empty
Aug 22 22:24:43 nixos systemd[312]: Mounting tmpfs (tmpfs) on /var/empty/ (MS_NODEV|MS_STRICTATIME "mode=0755,size=10%,nr_inodes=400k")...
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/bin/sh
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash → /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash on /var/empty/bin/sh (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash (type n/a) on /var/empty/bin/sh (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash on /var/empty/bin/sh (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23/bin/bash to /var/empty/bin/sh
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 → /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 on /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 (type n/a) on /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 on /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1 to /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 → /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 on /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 (type n/a) on /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 on /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23 to /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 → /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 on /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 (type n/a) on /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 on /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23 to /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 → /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 on /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 (type n/a) on /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 on /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47 to /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start → /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start on /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start (type n/a) on /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start on /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start to /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 → /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 on /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 (type n/a) on /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 on /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5 to /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 → /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 on /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 (type n/a) on /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 on /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10 to /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 → /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 on /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 (type n/a) on /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 (MS_BIND|MS_REC ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 on /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 (MS_BIND|MS_REC "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2 to /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/run/credentials
Aug 22 22:24:43 nixos systemd[312]: Applying namespace mount on /var/empty/run/systemd/incoming
Aug 22 22:24:43 nixos systemd[312]: Followed source symlinks /run/systemd/propagate/basic-confinement.service → /run/systemd/propagate/basic-confinement.service.
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /run/systemd/propagate/basic-confinement.service on /var/empty/run/systemd/incoming (MS_BIND "")...
Aug 22 22:24:43 nixos systemd[312]: Failed to mount /run/systemd/propagate/basic-confinement.service (type n/a) on /var/empty/run/systemd/incoming (MS_BIND ""): No such file or directory
Aug 22 22:24:43 nixos systemd[312]: Bind-mounting /run/systemd/propagate/basic-confinement.service on /var/empty/run/systemd/incoming (MS_BIND "")...
Aug 22 22:24:43 nixos systemd[312]: Successfully mounted /run/systemd/propagate/basic-confinement.service to /var/empty/run/systemd/incoming
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/bin/sh.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/043h5mh7ax05lnjycay7jc500834nc6l-libidn2-2.3.1.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/39k586qrghljxiyqvv72ni5fk89vshib-bash-interactive-4.4-p23.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/can473ld4dc8izcjlm4i5daxsh1yl5d8-bash-4.4-p23.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/ivhfzqxzgk0474zy9zz9babd1ldkjpfb-unit-script-basic-confinement-start.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/qd39ja6l01kylh3k5bd4s06v7s9q05hv-readline-7.0p5.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/wb33nijl4nzmvrl6rm65kaf2a57q5v4i-libunistring-0.9.10.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/nix/store/yn7nd41grg2mzvzliphalk70hs0rfpdr-ncurses-6.2.
Aug 22 22:24:43 nixos systemd[312]: Remounted /var/empty/run/systemd/incoming.
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Failed to read oom_kill field of memory.events cgroup attribute: No such file or directory
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Child 312 belongs to basic-confinement.service.
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Main process exited, code=exited, status=203/EXEC
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Failed with result 'exit-code'.
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Service will not restart (restart setting)
Aug 22 22:24:43 nixos systemd[1]: basic-confinement.service: Changed running -> failed

Note: Each of the bind mounts fails first since the target directory doesn't exist yet. The second mount attempt succeeds just like expected by this code:

systemd/src/core/namespace.c

Lines 1367 to 1393 in f95d1ef

r = mount_nofollow_verbose(LOG_DEBUG, what, mount_entry_path(m), NULL, MS_BIND|(rbind ? MS_REC : 0), NULL);
if (r < 0) {
bool try_again = false;
if (r == -ENOENT && make) {
int q;
/* Hmm, either the source or the destination are missing. Let's see if we can create
the destination, then try again. */
(void) mkdir_parents(mount_entry_path(m), 0755);
q = make_mount_point_inode_from_path(what, mount_entry_path(m), 0755);
if (q < 0)
log_error_errno(q, "Failed to create destination mount point node '%s': %m",
mount_entry_path(m));
else
try_again = true;
}
if (try_again)
r = mount_nofollow_verbose(LOG_DEBUG, what, mount_entry_path(m), NULL, MS_BIND|(rbind ? MS_REC : 0), NULL);
if (r < 0)
return log_error_errno(r, "Failed to mount %s to %s: %m", what, mount_entry_path(m));
}
log_debug("Successfully mounted %s to %s", what, mount_entry_path(m));

@yuwata
Copy link
Member

yuwata commented Aug 22, 2021

@andir I am not familiar with nixos, but I hope PR #20515 fixes this issue. PTAL.

@yuwata yuwata added bug 🐛 Programming errors, that need preferential fixing pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit labels Aug 22, 2021
@andir
Copy link
Contributor Author

andir commented Aug 22, 2021

@andir I am not familiar with nixos, but I hope PR #20515 fixes this issue. PTAL.

Thank you for the quick action! I'll have to backport that patch to v249 and will give it a try either tomorrow evening (EU times) or during the next weekend.

yuwata added a commit to yuwata/systemd that referenced this issue Aug 22, 2021
@yuwata
Copy link
Member

yuwata commented Aug 22, 2021

Note that the EXIT_EXEC issue seems to be caused by 888f65a.

yuwata added a commit to yuwata/systemd that referenced this issue Aug 23, 2021
@andir
Copy link
Contributor Author

andir commented Aug 27, 2021

@yuwata this does indeed fix the error that I described above.

I'm seeing another inconsistency with previous version around the handling of /run directories. I'll open another issue if it turns out to be an upstream issue.

andir added a commit to andir/nixpkgs that referenced this issue Aug 30, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
yu-re-ka pushed a commit to yu-re-ka/nixpkgs that referenced this issue Sep 7, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
andir added a commit to andir/nixpkgs that referenced this issue Sep 12, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
yu-re-ka pushed a commit to yu-re-ka/nixpkgs that referenced this issue Sep 13, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
codepeon pushed a commit to codepeon/systemd that referenced this issue Oct 14, 2021
Follow-up for 888f65a.

Hopefully fixes systemd#20514.

(cherry picked from commit 93413ac)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Programming errors, that need preferential fixing pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit
Development

Successfully merging a pull request may close this issue.

2 participants