pid1: add a new method of rebooting: userspace only under the name "soft-reboot" #27435
poettering commented Apr 27, 2023 (edited)
e86c071 to cff890d
|
This is not ready for merging yet: it lacks docs and tests, but is pretty comprehensive otherwise. It can do a supercharged reboot in an nspawn container and in qemu in no time.
|
This is great, and should open the door to easier testing here as well, as we can now do reboots without actually terminating qemu/nspawn and thus add special logic to the test harness. Bikeshedding: I'd really call this just "userspace reboot"; "renew" is really confusing if you don't already know what it does.
|
"systemctl userspace-reboot" is impossible to type |
|
That's why god invented bash completion! |
|
As a shorthand, we could also have a "systemctl uexec" alias - would pair well with kexec |
How about "reinit"? |
|
I prefer |
|
Maybe Btw, |
|
the main goal is to quickly transition to a new userspace snapshot, not just to re-execute the current one |
cff890d to 7c1af51
|
You'll need to add
diff --git a/test/units/testsuite-21.sh b/test/units/testsuite-21.sh
index 36f647ca5f..a0df607377 100755
--- a/test/units/testsuite-21.sh
+++ b/test/units/testsuite-21.sh
@@ -28,6 +28,7 @@ systemctl log-level info
 # FIXME: systemd-run doesn't play well with daemon-reexec
 # See: https://github.com/systemd/systemd/issues/27204
 sed -i '/\[org.freedesktop.systemd1\]/aorg.freedesktop.systemd1.Manager:Reexecute FIXME' /etc/dfuzzer.conf
+sed -i '/\[org.freedesktop.systemd1\]/aorg.freedesktop.systemd1.Manager:Renew destructive' /etc/dfuzzer.conf

 # TODO
 # * check for possibly newly introduced buses?
so dfuzzer doesn't keep triggering "renew" when fuzzing. Once this is merged (and the name of the method is settled) I'll update dfuzzer and drop the sed.
btw, does anyone have any idea where to find the current sources for launchd/launchctl? did apple take that closed source? I can only find a version from 7 years ago...
apparently people don't like the name "renew". so we are going to change this before merging anyway, i guess |
7c1af51 to 95f781c
|
@poettering Out of curiosity, how would switching to a different root work in this case? In other words, how does a non-destructive EDIT: It's because it's essentially |
|
More bikeshedding suggestions:
Also the existing commands could have new, complementing aliases for symmetry:
|
Sadly they stopped releasing the source years ago. |
I know that at least one of the Linux kernel VFS maintainers is aware of the issue. I wouldn't hold my breath though.
ostree is a userspace concept. It cannot avoid this kernel limitation. This needs to be fixed in kernel. |
|
This should have been called |
|
|
needs a rebase |
This adds a new mechanism for rebooting, a form of "userspace reboot" hereby dubbed "soft-reboot". It will stop all services as in a usual shutdown, possibly transition into a new root fs and then issue a fresh initial transaction. The kernel is not replaced. File descriptors can be passed over, thus opening the door for leaving certain resources around between such reboots.
Use case: this is an extremely quick way to reset userspace fully when updating image-based systems, without going through a full hardware/firmware/boot loader/kernel/initrd cycle. It minimizes "grayout time" for OS updates (in particular when combined with kernel live patching).
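As a rough illustration of the image-based update use case, the sketch below stages a new root and then requests a soft-reboot. It assumes the interface as it was eventually released (`systemctl soft-reboot`, with the new root picked up from `/run/nextroot`); the naming was still under discussion in this thread. The `stage_and_soft_reboot` helper and the image path are hypothetical, and `DRY_RUN` defaults to `1`, so the commands are only printed.

```shell
#!/usr/bin/env bash
set -eu

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# stage_and_soft_reboot IMAGE [NEXTROOT]
# Hypothetical helper: mounts IMAGE read-only on NEXTROOT and then asks the
# service manager for a soft-reboot. NEXTROOT defaults to /run/nextroot
# (assumption: this is where the new root is looked up).
stage_and_soft_reboot() {
    local image="$1"
    local nextroot="${2:-/run/nextroot}"

    run mkdir -p "$nextroot"
    run mount -o ro,loop "$image" "$nextroot"
    run systemctl soft-reboot
}

# Example (prints three commands, changes nothing):
# DRY_RUN=1 stage_and_soft_reboot /var/lib/images/os-next.squashfs
```

Keeping the dry-run path as the default is a deliberate choice here: the real commands need root and will tear down all running services.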
Generally, if you specify |
|
It would indeed make sense, by forcibly restarting everything run by the user in question. |
I wonder what to do about pcrphase and such - I think we do not want to re-run TPM measurements? To achieve this, and probably to help with other actions that might need to run only on the "real" boot, what about adding a new ConditionIsSoftReboot= or ConditionBootType= or so, that allows to match and skip? To implement it, it should be doable to pass a variable through via serialization in the manager at shutdown, so that the next iteration knows it's been through a soft reboot.
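The idea of carrying a marker across the transition can be sketched outside the manager as follows. This is only an illustration, not how the PR implements it: it uses a hypothetical state file (assuming a location such as `/run` that survives a soft-reboot) in place of the manager's own serialization, and all function names are made up.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical state file; a real implementation would carry this value
# through the manager's serialization instead of a file.
STATE_FILE="${STATE_FILE:-/run/example-soft-reboot-count}"

# Called by the old userspace right before a soft-reboot is issued.
record_soft_reboot() {
    local count=0
    if [ -r "$STATE_FILE" ]; then
        count="$(cat "$STATE_FILE")"
    fi
    echo $((count + 1)) > "$STATE_FILE"
}

# Prints the number of soft-reboots since the last real boot (0 if none).
soft_reboot_count() {
    if [ -r "$STATE_FILE" ]; then
        cat "$STATE_FILE"
    else
        echo 0
    fi
}

# Succeeds only on a "real" boot, i.e. when no soft-reboot was recorded.
is_real_boot() {
    [ "$(soft_reboot_count)" -eq 0 ]
}

# Example: skip a boot-only action (such as a TPM measurement) after a
# soft-reboot. run_measurement is a placeholder, not a real command.
# if is_real_boot; then run_measurement; fi
```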
        if (!isempty(root)) {
                if (!path_is_valid(root))
                        return sd_bus_error_setf(error, SD_BUS_ERROR_INVALID_ARGS,
                                                 "New root directory must be a valid path.");
include root in the log message?
(such as the file system they are backed by), thus increasing memory usage (as two versions of the
OS/application/file system might be kept in memory). Leaving processes running during a soft-reboot
operation requires disconnecting the service comprehensively from the rest of the OS, i.e. minimizing IPC
and reducing sharing of resources with the rest of the OS.</para>
maybe plug in portable services and nspawn as solutions that work well with this, and link their docs?
<para>Note that because
<citerefentry><refentrytitle>systemd-shutdown</refentrytitle><manvolnum>8</manvolnum></citerefentry> is
not executed the executables in <filename>/usr/lib/systemd/system-shutdown/</filename> are not executed
"not executed, the executables in"
echo "wuffwuff" > "$T"
systemd-notify --fd=3 --pid=parent 3<"$T"
rm "$T"
would be good to beef up this test, to add a normal service that is shut down as expected (and verify that it is), and one that survives (and verify that it does)
You can log out and log back in. That should get you what you want. Userspace reboots should probably be under the normal reboot commands. |
So that's a major discussion to be had, but I think we should do that separately.
That said, I nowadays think the only safe and secure way is to run at least the boot phase stuff, so that the n-th reboot can be securely distinguished from the (n-1)-th and (n+1)-th boot. But we need to start maintaining a proper log of measurements sooner rather than later, so that people can verify this. Soft reboots must appear as a new series of boot phases in the measurement logs, I think.
Note that the keys for FDE and such are already unlocked on a soft reboot, and we don't need to unlock them again, hence the boot phase stuff should still work out quite OK: you'd bind FDE unlocking to the initrd boot phase of the first boot, and then the FDE will work in perpetuity even though we can never access the key in the TPM anymore.
A different story is the measurement of machine-id and the mount UUIDs into PCR 15. We probably should not repeat that, and we don't really have to if it stays the same and mounted.
We can certainly consider that. For the fs UUID measurement, an alternative would be to simply leave the service up till the very end, so that on next boot it is still up because of serialization, and then we won't rerun it.
As far as we (the Darling project) know, launchd has been integrated into the XPC project, largely rewritten, and its sources are nowhere to be found just like the rest of XPC. We ship launchd-842.92.1 (apparently from OS X 10.9.4, 2014) and our own reimplementation of XPC. |
|
Is there a defined interface for the new userspace to find out how it was booted ("normally" or via soft-reboot)? Similarly, where it was booted from, analogous to /proc/cmdline (besides inspecting mountinfo in proc)? Or would the old userspace prepare that information together with I've not seen it in the code, but perhaps I've missed something.
Could you explain what is meant by image based operating system? |
|
@brotaxt For example, using two read-only SquashFS partitions (the disk "images"), one with the current root, and the other with the updated root to reboot into |
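A minimal sketch of the two-image scheme described above: given the slot the system currently runs from, pick the other one as the update target. The slot names `a`/`b` and the `root-<slot>` labels are purely hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# inactive_slot CURRENT
# Prints the slot that is not in use and can receive the updated image.
inactive_slot() {
    case "$1" in
        a) echo b ;;
        b) echo a ;;
        *) echo "unknown slot: $1" >&2; return 1 ;;
    esac
}

# image_for_slot SLOT
# Prints the (hypothetical) partition label holding the image for SLOT.
image_for_slot() {
    echo "root-$1"
}

# Example: running from slot "a", so the update goes to "root-b".
# image_for_slot "$(inactive_slot a)"
```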
@@ -129,6 +129,11 @@ Deprecations and removals:
Features:
* refuse using the switch-reboot operation without /etc/initrd-release. Now
It seems you meant switch-root, not switch-reboot