nspawn containers should be able to exit with a non-zero status #1290

alban · 2015-09-17T16:04:32Z

I have an nspawn container running systemd as PID 1, and it runs several services. If any service fails, I would like the container to terminate and nspawn to return a non-zero exit code ($? in the shell).

I can configure the services to run halt.target on failure, and the container will shut down. But systemd-nspawn will always return exit code 0. At the moment, it is not possible to return a different value when a service in the container fails.

The way it currently works in the container is:

1. some service calls systemctl halt
2. systemd calls systemd-shutdown
3. systemd-shutdown calls the system call reboot()
4. since we are not in the initial pid namespace, the kernel will not reboot the computer but instead ~~send SIGINT to pid1 (systemd)~~ pretend that SIGINT is sent to pid1 (systemd). The systemd process and all processes in the container get terminated.
5. systemd-nspawn notices that systemd was killed by SIGINT and understands that as a normal shutdown
6. systemd-nspawn returns the exit code 0
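The last two steps can be illustrated with a small Python sketch of how a supervising process decodes a child's wait status. This is not nspawn's code: `host_exit_code`, `run_child`, and the two child snippets are hypothetical names, and mapping other signals to 1 is an assumption made only for the example. It shows why a container whose PID 1 is reported as "killed by SIGINT" ends up as exit code 0, while a real exit status could be passed through.

```python
import signal
import subprocess
import sys


def host_exit_code(returncode: int) -> int:
    """Map a child's return code to the code a supervisor reports to its caller.

    Hypothetical policy modelled on the steps above; subprocess reports
    "killed by signal N" as the negative value -N.
    """
    if returncode == -signal.SIGINT:
        return 0  # "killed by SIGINT" is understood as a normal shutdown
    if returncode < 0:
        return 1  # any other signal counts as a failure (assumption)
    return returncode  # a genuine exit status is passed through unchanged


def run_child(snippet: str) -> int:
    """Run a Python one-liner in a child process and return its return code."""
    return subprocess.run([sys.executable, "-c", snippet]).returncode


# Child that dies from SIGINT with the default disposition restored.
KILLED_BY_SIGINT = (
    "import os, signal; "
    "signal.signal(signal.SIGINT, signal.SIG_DFL); "
    "os.kill(os.getpid(), signal.SIGINT)"
)
# Child that exits on its own with status 42.
EXIT_42 = "import sys; sys.exit(42)"

if __name__ == "__main__":
    print(host_exit_code(run_child(KILLED_BY_SIGINT)))
    print(host_exit_code(run_child(EXIT_42)))
```

With only the "killed by SIGINT" path available, the supervisor has no channel through which a failure code could travel, which is the gap this issue describes.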

After discussion on irc, it could be implemented in this way:

- adds ExitCode as property to PID 1's "Manager" bus object
- adds a SetExitCode() bus call to set it on the same object, and that returns an error on baremetal, thus allowing early failure
- adds support for setting this to "systemctl exit 55"
- adds the exit.target + exit.service units to the system instance
- changes systemd-shutdown to actually exit with the configured exit code
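The semantics proposed for ExitCode and SetExitCode() can be modelled with a toy Python class. This is only a sketch of the behaviour described in the list, not systemd's D-Bus implementation: the class, the `ExitCodeNotSupported` exception, and the constructor flags are hypothetical, the 0..255 range check is an assumption, and allowing the call for a user instance follows the follow-up comment about `systemd --user`.

```python
class ExitCodeNotSupported(Exception):
    """Raised when the manager cannot exit with a code (system instance on bare metal)."""


class Manager:
    """Toy model of PID 1's "Manager" object with an exit code attached."""

    def __init__(self, *, in_container: bool, user_instance: bool = False):
        self._in_container = in_container
        self._user_instance = user_instance
        self._exit_code = 0  # default 0 keeps the previous behaviour

    @property
    def exit_code(self) -> int:
        """Read-only view of the configured exit code."""
        return self._exit_code

    def set_exit_code(self, code: int) -> None:
        """Set the exit code, failing early where exiting makes no sense."""
        if not (self._in_container or self._user_instance):
            raise ExitCodeNotSupported("exit codes are not supported on bare metal")
        if not 0 <= code <= 255:  # assumed range of a process exit status
            raise ValueError("exit code must be in the range 0..255")
        self._exit_code = code


if __name__ == "__main__":
    manager = Manager(in_container=True)
    manager.set_exit_code(55)  # roughly what "systemctl exit 55" would request
    print(manager.exit_code)
```

Rejecting the call up front on bare metal means the caller learns immediately that the request cannot be honoured, instead of discovering it at shutdown time.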

This feature request was inspired by rkt/rkt#1407 (comment)

/cc @yifan-gu @iaguis @poettering


poettering · 2015-09-17T16:14:41Z

Looks good to me.

Note that the systemd --user instance does not invoke systemd-shutdown but exits right away. Of course, the exit code should work for that too.

yifan-gu · 2015-09-17T18:40:04Z

SGTM, thanks @alban !

yifan-gu · 2015-09-18T01:22:42Z

btw, I sent SIGINT to the systemd process in the nspawn. The nspawn does reboot.

poettering · 2015-09-18T09:36:08Z

@yifan-gu yeah, sending SIGINT to PID 1 results in reboot on SysV. The kernel actually never sends SIGINT to the PID1, but instead just reports to nspawn a reboot() called in the container as "killed by signal SIGINT"...

Fix systemd#1290

When a systemd service running in a container exits with a non-zero code, it can be useful to terminate the container immediately and get the exit code back to the host, when systemd-nspawn returns. This was not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object. By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when called on baremetal: it is only allowed in containers or in user session.
- Add support in systemctl to call "systemctl exit 42". It reuses the existing code for user session.
- Add exit.target to the system instance. It has the following condition: ConditionVirtualization=container.
- Change main() to actually allow exit() with the correct value.
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
| --mds-register=false --local docker://busybox \
| --exec=/bin/chroot -- /proc/1/root \
| systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

I don't know why I have to use --force.

Fixes systemd#1290

When a systemd service running in a container exits with a non-zero code, it can be useful to terminate the container immediately and get the exit code back to the host, when systemd-nspawn returns. This was not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object. By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when called on baremetal: it is only allowed in containers or in user session.
- Add support in systemctl to call "systemctl exit 42". It reuses the existing code for user session.
- Add exit.target to the system instance. It has the following condition: ConditionVirtualization=container.
- Change main() to actually call systemd-shutdown to exit() with the correct value.
- Add verb 'exit' in systemd-shutdown with parameter --exit-code
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
| --mds-register=false --local docker://busybox \
| --exec=/bin/chroot -- /proc/1/root \
| systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

Fixes systemd#1290

alban · 2015-09-21T11:15:15Z

For the record, this feature will be available in v227.

When a systemd service running in a container exits with a non-zero code, it can be useful to terminate the container immediately and get the exit code back to the host, when systemd-nspawn returns. This was not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object. By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when called on baremetal: it is only allowed in containers or in user session.
- Add support in systemctl to call "systemctl exit 42". It reuses the existing code for user session.
- Add exit.target and systemd-exit.service to the system instance.
- Change main() to actually call systemd-shutdown to exit() with the correct value.
- Add verb 'exit' in systemd-shutdown with parameter --exit-code
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
| --mds-register=false --local docker://busybox \
| --exec=/bin/chroot -- /proc/1/root \
| systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

Fixes systemd#1290

alban added nspawn systemctl new-feature labels Sep 17, 2015

alban mentioned this issue Sep 17, 2015

stage1: rework systemd service structure rkt/rkt#1407

Merged

poettering added the RFE 🎁 label (Request for Enhancement, i.e. a feature request) and removed the new-feature label Sep 18, 2015

alban self-assigned this Sep 18, 2015

alban added a commit to alban/systemd that referenced this issue Sep 18, 2015

systemd: return a non-zero exit code when requested

0295672

Fix systemd#1290

alban added a commit to alban/systemd that referenced this issue Sep 18, 2015

systemd: return a non-zero exit code when requested

2ca8bd2

Fix systemd#1290

alban mentioned this issue Sep 18, 2015

containers: systemd exits with non-zero code #1301

Closed

alban mentioned this issue Sep 20, 2015

containers: systemd exits with non-zero code #1310

Closed

alban mentioned this issue Sep 21, 2015

containers: systemd exits with non-zero code #1313

Closed

alban mentioned this issue Sep 21, 2015

containers: systemd exits with non-zero code #1318

Merged

alban closed this as completed in #1318 Sep 21, 2015

poettering added this to the v227 milestone Sep 21, 2015

ghost mentioned this issue Feb 3, 2020

Udev cannot send udp networkpaket #14756

Closed
