Unit fails with EXIT_EXEC / NAMESPACE after upgrade from v247 to v249 / v248 #20514
Labels
bug 🐛
Programming errors, that need preferential fixing
pid1
regression ⚠️
A bug in something that used to work correctly and broke through some recent commit
Comments
yuwata
added a commit
to yuwata/systemd
that referenced
this issue
Aug 22, 2021
Follow-up for 888f65a. Hopefully fixes systemd#20514.
Note that the |
yuwata
added a commit
to yuwata/systemd
that referenced
this issue
Aug 23, 2021
Follow-up for 888f65a. Hopefully fixes systemd#20514.
@yuwata this does indeed fix the error that I described above. I'm seeing another inconsistency with previous version around the handling of |
andir
added a commit
to andir/nixpkgs
that referenced
this issue
Aug 30, 2021
This updates systemd to version v249.4 from version v247.6. Besides the many new features that can be found in the upstream repository they also introduced a bunch of cleanup which ended up requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch: The way symlinked units were handled was changed such that the last name of a unit file within one of the unit directories (/run/systemd/system, /etc/systemd/system, ...) is used as the name for the unit. Unfortunately that code didn't take into account that the unit directories themselves could already be symlinks and thus caused all our units to be recognized slightly differently. There is an upstream PR for this new patch: systemd/systemd#20479

b) The way the APIVFS is set up has been changed in such a way that we now always have /run. This required a few changes to the confinement tests, which asserted that it didn't exist. Instead of adding another patch we can just adopt the upstream behavior. An empty /run doesn't seem harmful. As part of this work I refactored the confinement test just a little bit to allow better debugging of test failures. Previously it would just fail at some point and it wasn't obvious which of the many commands failed or what the unexpected string was. This should now be more obvious.

c) Again related to the confinement tests, the way a file was tested for being accessible was optimized. Previously systemd would in some situations open a file twice during that check. This was reduced to one operation but required the procfs to be mounted in a unit's namespace. An upstream bug was filed and fixed. We are now carrying the essential patch to fix that issue until it is backported to a new release (likely only version 250). The good part about this story is that upstream systemd now has a test case that looks very similar to one of our confinement tests. Hopefully that will lead to less friction in the long run.
systemd/systemd#20514 systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now upstream started using a wrapper around dlopen whose calls are most of the time split across line breaks. This makes using grep not ergonomic anymore. With this bump we are grepping for anything that looks like a dynamic library name (in contrast to a dlopen(3) call) and replace those instead. That seems more robust. Time will tell if this holds. I tried using coccinelle to patch all those call sites using its tooling but unfortunately it stumbles over the _cleanup_ annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build. That being said, it doesn't actually work, as generating some skeletons doesn't work just yet. It fails with the below error message and is disabled by default (in both the minimal and the regular build).
> FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
> /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
> libbpf: elf: socket_bind_bpf is not a valid eBPF object file
> Error: failed to open BPF object file: BPF object format invalid
> Traceback (most recent call last):
>   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
>     bpf_build(args)
>   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
>     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
>   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
>     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
>   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
>     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
>   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
>     raise CalledProcessError(retcode, process.args,
> subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
> [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.o
> ninja: build stopped: subcommand failed.

f) We do now have support for TPM2 based disk encryption in our systemd build. The actual bits and pieces to make use of that are missing but there are various ongoing efforts in that direction. There is also the story about systemd in our initrd to enable this being used for root volumes. None of this will yet work out of the box but we can start improving on that front.

g) FIDO2 support was added to systemd and consequently we can now use that. Just as with TPM2, there hasn't been any integration work with NixOS; this just adds the capability so that work can start.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
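Item d) of the commit message above describes matching strings that look like shared-library names instead of matching dlopen( calls. A minimal Python sketch of that idea follows. The regular expression, the function names, the wrapper name `dlopen_wrapper`, and the sample input are all assumptions made up for illustration; this is not the actual nixpkgs implementation.

```python
import re
from typing import Dict, List

# Hypothetical pattern: a C string literal such as "libfoo.so" or "libfoo.so.1".
# The pattern used in nixpkgs may well differ.
LIB_NAME_RE = re.compile(r'"(lib[A-Za-z0-9_+-]+\.so(?:\.[0-9]+)*)"')


def find_library_names(source: str) -> List[str]:
    """Return every quoted string in `source` that looks like a shared-library name."""
    return LIB_NAME_RE.findall(source)


def replace_library_names(source: str, absolute_paths: Dict[str, str]) -> str:
    """Replace known library names with absolute paths; unknown names stay untouched."""
    def substitute(match):
        name = match.group(1)
        return '"' + absolute_paths.get(name, name) + '"'

    return LIB_NAME_RE.sub(substitute, source)


# Made-up input: the call is split across lines, so a line-based search for
# "dlopen(" followed by the library name would not find it.
sample = '''
r = dlopen_wrapper(
        &dl,
        "libexample.so.1", LOG_DEBUG);
'''
```

Because the match is on the string literal alone, it does not matter how the surrounding call is wrapped or which helper performs the actual dlopen(3).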
yu-re-ka
pushed a commit
to yu-re-ka/nixpkgs
that referenced
this issue
Sep 7, 2021
andir
added a commit
to andir/nixpkgs
that referenced
this issue
Sep 12, 2021
yu-re-ka
pushed a commit
to yu-re-ka/nixpkgs
that referenced
this issue
Sep 13, 2021
codepeon
pushed a commit
to codepeon/systemd
that referenced
this issue
Oct 14, 2021
Follow-up for 888f65a. Hopefully fixes systemd#20514. (cherry picked from commit 93413ac)
systemd version the issue has been seen with
Used distribution
Linux kernel version used (uname -a)
CPU architecture issue was seen on
Expected behaviour you didn't see
After upgrading a system from version 247 to 249 (or 248), a service unit (as shown below) fails during startup with EXIT_EXEC, and with version 248 it fails during namespace creation.
Note: I am not entirely sure the unit is written exactly the way it should be. I am just debugging this regression that I noticed while trying to upgrade the systemd package in NixOS.
Unit file as used on Debian (v247):
Unit file as used on Fedora (v248):
Unit as used on NixOS (v247 / v249):
Unexpected behaviour you saw
On v249 the unit fails to start with 203 (EXIT_EXEC). On v248 (Fedora 34) the unit would fail during namespace setup.
v248 (Fedora 34) error
When changing the BindReadOnlyPaths= to just /lib64/ and /usr/bin/ it also ends up in a 203.
v249 (NixOS) error
v247 (NixOS & Debian) success
Steps to reproduce the problem
systemctl daemon-reload
systemctl start <my-unit>.service
systemctl status <my-unit>.service
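The three steps above can be scripted. The following is a small sketch, assuming `systemctl` is on the PATH and that the unit name is passed in by the caller; the function names are hypothetical. The exit codes are collected rather than checked, since the start step is expected to fail on affected versions.

```python
import subprocess
from typing import List


def reproduction_commands(unit: str) -> List[List[str]]:
    """Build the three systemctl invocations from the reproduction steps."""
    return [
        ["systemctl", "daemon-reload"],
        ["systemctl", "start", unit],
        ["systemctl", "status", unit],
    ]


def run_reproduction(unit: str) -> List[int]:
    """Run the commands in order and return their exit codes.

    check=False is deliberate: a failing start must not abort the run,
    because the status output afterwards is the interesting part.
    """
    return [subprocess.run(cmd, check=False).returncode
            for cmd in reproduction_commands(unit)]
```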
Additional program output to the terminal or log subsystem illustrating the issue
I ran the whole process on v249 under strace in a systemd-nspawn container and here is what I saw:
It seems like systemd tries to access the ExecStart= executable via /proc/self/fd/%d before startup. That path isn't there anymore, because systemd does a chroot just before that and then there is no /proc anymore.
After setting up all the (ro) bind mounts systemd "chroots" to /var/empty (our RootDirectory):
Immediately after that it tries to access the executable.
systemd/src/core/execute.c
Lines 4351 to 4353 in f95d1ef
It calls the find_executable_full function and eventually the check_x_access function.
But there is no /proc anymore, which leads to setting the exit status to 203:
systemd/src/core/execute.c
Lines 4353 to 4367 in f95d1ef
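To make the failure mode concrete, here is a minimal Python sketch of an access check that goes through /proc/self/fd/<fd>, as described above. It is not the systemd code: the function name mirrors the idea only, and the `proc_root` parameter is a hypothetical addition that lets a missing procfs (the situation after the chroot into /var/empty) be simulated without root privileges.

```python
import errno
import os


def access_fd(fd: int, mode: int, proc_root: str = "/proc") -> int:
    """Check access to an already opened file via <proc_root>/self/fd/<fd>.

    Returns 0 on success or a negative errno value. `proc_root` is a
    hypothetical parameter used only to simulate a tree without procfs.
    """
    link = os.path.join(proc_root, "self", "fd", str(fd))
    if not os.path.lexists(link):
        # No procfs at this location: the check cannot even be attempted.
        return -errno.ENOENT
    return 0 if os.access(link, mode) else -errno.EACCES


if __name__ == "__main__":
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        print("with /proc:   ", access_fd(fd, os.R_OK))
        print("without /proc:", access_fd(fd, os.R_OK, proc_root="/nonexistent-proc-root"))
    finally:
        os.close(fd)
```

The file descriptor itself is still valid in the second case; only the path used to query it is gone. That matches the observation that the executable is present in the chroot and the check still fails, ending in exit status 203.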
The corresponding mount table (just before the chdir & chroot above):
If we filter that for paths in /var/empty we get the below list.
And indeed there is no /proc anymore after the chroot("/var/empty") call.
Here is the unit startup with v249 on NixOS and SYSTEMD_LOG_LEVEL=debug:
Note: Each of the bind mounts fails first since the target directory doesn't exist yet. The second mount attempt succeeds, just as expected by this code:
systemd/src/core/namespace.c
Lines 1367 to 1393 in f95d1ef
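The behaviour noted above (first attempt fails because the target is missing, second attempt succeeds) is a try, create, retry pattern. The sketch below illustrates that pattern only and is not a translation of the referenced systemd code. `mount_with_retry`, the `do_mount` callable, and the fake mount used in `demo` are hypothetical, and the sketch assumes the mount target is a directory.

```python
import errno
import os
import tempfile
from typing import Callable, List


def mount_with_retry(source: str, target: str,
                     do_mount: Callable[[str, str], None]) -> int:
    """Call do_mount; on a missing target, create it and retry once.

    Returns the number of attempts that were needed (1 or 2).
    """
    try:
        do_mount(source, target)
        return 1
    except FileNotFoundError:
        # Assumption: the target is a directory mount point.
        os.makedirs(target, exist_ok=True)
    do_mount(source, target)
    return 2


def demo() -> List[int]:
    """Exercise the retry logic with a fake mount that only checks the target."""
    calls = []

    def fake_mount(source: str, target: str) -> None:
        calls.append(target)
        if not os.path.isdir(target):
            raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), target)

    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "var", "empty", "usr", "bin")
        first = mount_with_retry("/usr/bin", target, fake_mount)
        second = mount_with_retry("/usr/bin", target, fake_mount)
    return [first, second, len(calls)]
```

In `demo`, the first call needs two attempts because the target does not exist yet, while the second call succeeds immediately. This is why a failed first mount attempt in the debug log is expected and not part of the regression.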