nspawn containers should be able to exit with a non-zero status #1290

Closed
alban opened this issue Sep 17, 2015 · 5 comments
Labels: nspawn; RFE 🎁 Request for Enhancement, i.e. a feature request; systemctl


@alban
Member

alban commented Sep 17, 2015

I have an nspawn container running systemd as pid1, and it runs several services. If any service fails, I would like the container to terminate and nspawn to return a non-zero exit code ($? in the shell).

I can configure the services to start halt.target on failure, and the container will shut down. But systemd-nspawn will always return the exit code 0. At the moment, it is not possible to return a different value when a service in the container fails.

The way it currently works in the container is:

  • some service calls systemctl halt
  • systemd calls systemd-shutdown
  • systemd-shutdown calls the system call reboot()
  • since we are not in the initial pid namespace, the kernel will not reboot the computer but instead pretend that SIGINT is sent to pid1 (systemd). The systemd process and all processes in the container get terminated.
  • systemd-nspawn notices that systemd was killed by SIGINT and understands that as a normal shutdown
  • systemd-nspawn returns the exit code 0
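
The last two steps above come down to how a parent process reads its child's wait status. The following is a minimal sketch of that idea, not nspawn's actual code: `run_child` and `supervisor_exit_code` are hypothetical names, a plain forked process stands in for the container's pid1, and the handling of anything other than "killed by SIGINT" is an assumption made only to keep the example complete.

```python
import os
import signal


def run_child(action):
    """Fork a child that runs `action` and return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: restore the default SIGINT disposition so the signal
        # terminates the process instead of raising KeyboardInterrupt.
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        try:
            action()
        finally:
            os._exit(0)
    _, status = os.waitpid(pid, 0)
    return status


def supervisor_exit_code(status):
    """Hypothetical model of the current mapping described above."""
    if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGINT:
        # "Killed by SIGINT" is read as a normal shutdown.
        return 0
    if os.WIFEXITED(status):
        # Assumption: an ordinary exit status is passed through unchanged.
        return os.WEXITSTATUS(status)
    # Assumption: any other termination is reported as a generic failure.
    return 1


# The child stands in for pid1 after reboot() was called in the container.
status = run_child(lambda: os.kill(os.getpid(), signal.SIGINT))
print(supervisor_exit_code(status))
```

In this model the information about which service failed is lost, because the only thing the parent sees is the terminating signal.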

After discussion on irc, it could be implemented in this way:

  • adds ExitCode as property to PID 1's "Manager" bus object
  • adds a SetExitCode() bus call to set it on the same object, and that returns an error on baremetal, thus allowing early failure
  • adds support for setting this to "systemctl exit 55"
  • adds the exit.target + exit.service units to the system instance
  • changes systemd-shutdown to actually exit with the configured exit code
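
To make the proposed semantics concrete, here is a toy model of the first two points. It is only a sketch: the `Manager` class below is a hypothetical plain-Python stand-in, not D-Bus code and not systemd's implementation, and the 0..255 range check is an assumption based on exit statuses being 8 bits wide.

```python
class Manager:
    """Toy stand-in for PID 1's "Manager" bus object (illustration only)."""

    def __init__(self, in_container):
        self._in_container = in_container
        # Default of 0 keeps the behaviour unchanged unless a code is set.
        self._exit_code = 0

    @property
    def exit_code(self):
        """Read-only view of the configured exit code."""
        return self._exit_code

    def set_exit_code(self, code):
        """Set the exit code; refuse early when not in a container."""
        if not self._in_container:
            raise PermissionError("exit code can only be set in a container")
        # Assumption: the value must fit in an 8-bit exit status.
        if not 0 <= code <= 255:
            raise ValueError("exit code must be in the range 0..255")
        self._exit_code = code


container = Manager(in_container=True)
container.set_exit_code(55)
print(container.exit_code)
```

Failing in `set_exit_code` rather than at shutdown time is what "allowing early failure" refers to: the caller learns immediately that the request cannot be honoured.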

This feature request was inspired by rkt/rkt#1407 (comment)

/cc @yifan-gu @iaguis @poettering

@poettering
Member

Looks good to me.

Note that the systemd --user instance does not invoke systemd-shutdown but exits right away. Of course, the exit code should work for that too.

@yifan-gu

SGTM, thanks @alban !

@yifan-gu

btw, I sent SIGINT to the systemd process in the nspawn. The nspawn does reboot.

@poettering
Member

@yifan-gu yeah, sending SIGINT to PID 1 results in reboot on SysV. The kernel actually never sends SIGINT to the PID1, but instead just reports to nspawn a reboot() called in the container as "killed by signal SIGINT"...

@poettering poettering added RFE 🎁 Request for Enhancement, i.e. a feature request and removed new-feature labels Sep 18, 2015
@alban alban self-assigned this Sep 18, 2015
alban added a commit to alban/systemd that referenced this issue Sep 18, 2015
When a systemd service running in a container exits with a non-zero
code, it can be useful to terminate the container immediately and get
the exit code back to the host, when systemd-nspawn returns. This was
not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object.
  By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when
  called on baremetal: it is only allowed in containers or in user
  session.
- Add support in systemctl to call "systemctl exit 42". It reuses the
  existing code for user session.
- Add exit.target to the system instance. It has the following
  condition: ConditionVirtualization=container.
- Change main() to actually allow exit() with the correct value.
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
|            --mds-register=false --local docker://busybox \
|            --exec=/bin/chroot -- /proc/1/root \
|            systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

I don't know why I have to use --force.

Fixes systemd#1290
alban added a commit to alban/systemd that referenced this issue Sep 20, 2015
When a systemd service running in a container exits with a non-zero
code, it can be useful to terminate the container immediately and get
the exit code back to the host, when systemd-nspawn returns. This was
not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object.
  By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when
  called on baremetal: it is only allowed in containers or in user
  session.
- Add support in systemctl to call "systemctl exit 42". It reuses the
  existing code for user session.
- Add exit.target to the system instance. It has the following
  condition: ConditionVirtualization=container.
- Change main() to actually call systemd-shutdown to exit() with the
  correct value.
- Add verb 'exit' in systemd-shutdown with parameter --exit-code
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
|            --mds-register=false --local docker://busybox \
|            --exec=/bin/chroot -- /proc/1/root \
|            systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

Fixes systemd#1290
@alban
Member Author

alban commented Sep 21, 2015

alban added a commit to alban/systemd that referenced this issue Sep 21, 2015
When a systemd service running in a container exits with a non-zero
code, it can be useful to terminate the container immediately and get
the exit code back to the host, when systemd-nspawn returns. This was
not possible to do. This patch adds the following to make it possible:

- Add a read-only "ExitCode" property on PID 1's "Manager" bus object.
  By default, it is 0 so the behaviour stays the same as previously.
- Add a method "SetExitCode" on the same object. The method fails when
  called on baremetal: it is only allowed in containers or in user
  session.
- Add support in systemctl to call "systemctl exit 42". It reuses the
  existing code for user session.
- Add exit.target and systemd-exit.service to the system instance.
- Change main() to actually call systemd-shutdown to exit() with the
  correct value.
- Add verb 'exit' in systemd-shutdown with parameter --exit-code
- Update systemctl manpage.

I used the following to test it:

| $ sudo rkt --debug --insecure-skip-verify run \
|            --mds-register=false --local docker://busybox \
|            --exec=/bin/chroot -- /proc/1/root \
|            systemctl --force exit 42
| ...
| Container rkt-895a0cba-5c66-4fa5-831c-e3f8ddc5810d failed with error code 42.
| $ echo $?
| 42

Fixes systemd#1290
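
Once the exit code reaches the host, a wrapper script can act on it. The sketch below shows one way to do that with Python's `subprocess` module. The `run_container` helper is a hypothetical name, and the command used here is a placeholder that simply exits with 42; in practice the argument list would be the systemd-nspawn or rkt invocation.

```python
import subprocess
import sys


def run_container(argv):
    """Run `argv` and return its exit status.

    `argv` stands in for a systemd-nspawn or rkt command line.
    """
    result = subprocess.run(argv)
    return result.returncode


# Placeholder child that exits with 42, mimicking "systemctl exit 42".
code = run_container([sys.executable, "-c", "import sys; sys.exit(42)"])
if code != 0:
    print(f"container failed with exit code {code}")
```

This is the same check as `echo $?` in the test above, expressed so that a supervising program can branch on the value.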
@poettering poettering added this to the v227 milestone Sep 21, 2015
3 participants