add support for soft-reboot #309911

Draft: wants to merge 4 commits into master

Member

@arianvp arianvp commented May 7, 2024

Description of changes

Usage:

sudo nixos-rebuild boot && sudo systemctl soft-reboot

switch-to-configuration boot now prepares a symlink /run/next-system.
When systemctl soft-reboot is called, /run/next-system/activate is run
automatically just before the soft reboot, so that the system being
booted into is activated.
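
The flow described above can be sketched in plain shell. This is only an illustration under stated assumptions: a scratch directory stands in for /, the store path name is hypothetical, and the stub activate script merely prints a message. The real logic lives in switch-to-configuration and in a systemd unit.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for "/", so no root access is needed.
root=$(mktemp -d)
toplevel="$root/nix/store/hypothetical-nixos-system"   # hypothetical store path
mkdir -p "$root/run" "$toplevel"

# Stub activate script; a real one performs the actual activation.
printf 'echo "activated %s"\n' "$toplevel" > "$toplevel/activate"

# Step 1: "switch-to-configuration boot" records the system to boot next.
ln -sfn "$toplevel" "$root/run/next-system"

# Step 2: just before the soft reboot, run the activate script if it exists.
activation_output=""
if [ -e "$root/run/next-system/activate" ]; then
  activation_output=$(sh "$root/run/next-system/activate")
fi
echo "$activation_output"
```

If /run/next-system does not exist, step 2 is skipped and the soft reboot simply restarts userspace of the system that is already active.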

Not only can soft-reboot be used as an alternative to kexec and reboot, but
it can also be used as an alternative to nixos-rebuild switch in many cases.

Unlike kexec and reboot, soft-reboot can keep certain resources around across the reboot.
For example, socket units can be configured to stay around so that reboots happen
without the system ceasing to accept new connections. Furthermore, file descriptors can be
passed across boots using FDSTORE to keep even existing connections intact.

Please see the man-page for more info: https://www.freedesktop.org/software/systemd/man/latest/systemd-soft-reboot.service.html#

From the systemd NEWS:

A "soft reboot" is similar to a regular reboot, except that it affects
userspace only: the service manager shuts down any running services and other
units, then optionally switches into a new root file system (mounted to
/run/nextroot/), and then passes control to a systemd instance in the new file
system which then starts the system up again. The kernel is not rebooted and
neither is the hardware, firmware or boot loader. This provides a fast,
lightweight mechanism to quickly reset or update userspace, without the latency
that a full system reset involves. Moreover, open file descriptors may be
passed across the soft reboot into the new system where they will be passed
back to the originating services. This allows pinning resources across the
reboot, thus minimizing grey-out time further.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@arianvp arianvp force-pushed the soft-reboot branch 4 times, most recently from b8c0bb7 to 57e4d6c Compare May 7, 2024 19:25
@arianvp arianvp force-pushed the soft-reboot branch 2 times, most recently from fc6c978 to b619ed4 Compare May 7, 2024 21:01
@arianvp arianvp marked this pull request as ready for review May 7, 2024 21:09
@arianvp arianvp requested review from a team and dasJ as code owners May 7, 2024 21:09
Member Author

arianvp commented May 7, 2024

This needs a VM test. Also, this probably cannot be merged until after the 24.05 cutoff has happened.

> A "soft reboot" is similar to a regular reboot, except that it affects
> userspace only: the service manager shuts down any running services and other
> units, then optionally switches into a new root file system (mounted to
> /run/nextroot/), and then passes control to a systemd instance in the new file
> system which then starts the system up again. The kernel is not rebooted and
> neither is the hardware, firmware or boot loader. This provides a fast,
> lightweight mechanism to quickly reset or update userspace, without the latency
> that a full system reset involves. Moreover, open file descriptors may be
> passed across the soft reboot into the new system where they will be passed
> back to the originating services. This allows pinning resources across the
> reboot, thus minimizing grey-out time further.

Not only can soft-reboot be used as an alternative to kexec and reboot, but
it can also be used as an alternative to `nixos-rebuild switch` in many cases.

Unlike kexec and reboot, soft-reboot can keep certain resources around across the reboot.
For example, socket units can be configured to stay around such that reboots can happen
without dropping any connections. Furthermore file descriptors can be passed across
boots using FDSTORE to even keep existing connections intact.

Please see the man-page for more info: https://www.freedesktop.org/software/systemd/man/latest/systemd-soft-reboot.service.html#

switch-to-configuration boot will now create a symlink /run/next-system.

systemctl soft-reboot will call /run/next-system/activate just before
soft-rebooting, making sure that the system we're rebooting into is activated.

Thus you can now do:

nixos-rebuild boot && systemctl soft-reboot

We want to get rid of specialFileSystems / earlyMountScript eventually and
there is no need to run this before systemd anymore now that
the wrappers themselves are set up in a systemd unit since NixOS#263203

This is also needed to make soft-reboot work. We want to make sure
that /run/wrappers is remounted with the nosuid bit removed on soft-reboot,
but because @earlyMountScript@ runs in the initrd, that wouldn't happen.
Member Author

arianvp commented May 7, 2024

Neat follow-up, now that soft-reboot can be used as an alternative to switch-to-configuration switch:

Make a version of switch-to-configuration.pl that:

  1. Is not written in Perl
  2. Only sets /nix/var/nix/profiles/system and /run/next-system and nothing else
  3. Installs bootloader entries
  4. Throws away all the systemd service start/stop code, as it's not needed anymore

@arianvp arianvp changed the title nixos/systemd: add support for soft-reboot add support for soft-reboot May 7, 2024
@@ -110,6 +110,8 @@
}

if ($action eq "boot") {
# /run/next-system/activate is called when systemctl soft-reboot is called
system("@coreutils@/bin/ln", "-sfn", "$toplevel", "/run/next-system") == 0 or exit 1;
Member Author

Should we do this on "switch" too?

Contributor

Doesn't seem necessary. The point of next-system is to have a link to activate on soft-reboot, and a switch will have already activated the new system. Even on a soft-reboot, reactivation shouldn't be necessary.

DefaultDependencies = false;
ConditionPathExists = "/run/next-system/activate";
};
serviceConfig.ExecStart = "/run/next-system/activate";
Member Author

Should this fall back to /run/current-system/activate if /run/next-system/activate doesn't exist?

Contributor

Same argument as not creating next-system on switch; the current system is already activated so it shouldn't be necessary.
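
The two behaviours discussed in this thread can be contrasted in a small shell sketch. It is not the unit from the PR: `pick_activate` is a hypothetical helper, and a scratch directory stands in for /. Without the fallback, a missing /run/next-system/activate means nothing runs, which is what ConditionPathExists= gives; with the fallback, the current system's script would be run again.

```shell
#!/bin/sh
set -eu

# pick_activate ROOT FALLBACK: print the activate script that would run, if any.
# Hypothetical helper; FALLBACK is "yes" or "no".
pick_activate() {
  if [ -e "$1/run/next-system/activate" ]; then
    echo "$1/run/next-system/activate"
  elif [ "$2" = yes ] && [ -e "$1/run/current-system/activate" ]; then
    echo "$1/run/current-system/activate"
  fi
  return 0
}

root=$(mktemp -d)
mkdir -p "$root/run/current-system"
: > "$root/run/current-system/activate"

# No next-system yet: the condition fails and nothing is selected.
without_fallback=$(pick_activate "$root" no)
# With the fallback, the already-activated current system is selected again.
with_fallback=$(pick_activate "$root" yes)

# After "switch-to-configuration boot", next-system wins in both modes.
mkdir -p "$root/run/next-system"
: > "$root/run/next-system/activate"
after_boot=$(pick_activate "$root" no)

echo "without fallback: '$without_fallback'"
echo "with fallback:    '$with_fallback'"
echo "after boot:       '$after_boot'"
```

The only case where the fallback changes anything is the one where it re-activates a system that is already active, which is the reviewer's argument for leaving it out.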

This fixes the long-standing issue NixOS#50300.
# make sure /run/next-system isn't garbage collected before the next boot
systemd.tmpfiles.rules = [ "L+ /nix/var/nix/gcroots/next-system - - - - /run/next-system" ];

# if /run/next-system exists, activate it before rebooting
Member

Why is this necessary?

Member Author

@arianvp arianvp May 8, 2024

Activation happens in the initrd (at least if systemd-initrd is enabled) which soft-reboot doesn't transition into. Same reason why we do activation in the initrd before transitioning into stage-2.

If we don't do this the new system will never get activated

Member

NickCao commented May 8, 2024

It seems the minimal required part of this PR is the two additional upstream units and the wrapper mount point. That is enough for the nixos-rebuild boot then systemctl soft-reboot workflow to work, as nixos-rebuild boot already links the system profile to /nix/var/nix/profiles/system. There is no need for the /run/next-system shenanigans.

Member Author

arianvp commented May 8, 2024

> It seems the minimal required part of this PR is the two additional upstream units and the wrapper mount point. That is enough for the nixos-rebuild boot then systemctl soft-reboot workflow to work, as nixos-rebuild boot already links the system profile to /nix/var/nix/profiles/system. There is no need for the /run/next-system shenanigans.

Using /nix/var/nix/profiles/system is incorrect. See long-standing issue #50300 (which the last commit from this PR fixes; but can maybe be split up into a new PR)

/nix/var/nix/profiles/system is not guaranteed to be updated when nixos-rebuild boot is run due to the existence of alternative system profiles and specialisations
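
A small shell sketch of that failure mode, under loudly stated assumptions: a scratch directory stands in for /, the two store paths are hypothetical, and plain symlinks stand in for what nixos-rebuild and switch-to-configuration do. It mimics building into an alternative profile (here named `test`, a made-up name): the default system profile keeps pointing at the old generation while /run/next-system points at the one just built.

```shell
#!/bin/sh
set -eu

root=$(mktemp -d)
profiles="$root/nix/var/nix/profiles"
mkdir -p "$root/run" "$profiles/system-profiles" \
         "$root/nix/store/old-system" "$root/nix/store/new-system"

# Default profile left over from an earlier rebuild.
ln -sfn "$root/nix/store/old-system" "$profiles/system"

# Simulated "nixos-rebuild boot" into an alternative profile:
# only that profile and /run/next-system are updated.
ln -sfn "$root/nix/store/new-system" "$profiles/system-profiles/test"
ln -sfn "$root/nix/store/new-system" "$root/run/next-system"

default_target=$(readlink "$profiles/system")
next_target=$(readlink "$root/run/next-system")

if [ "$default_target" != "$next_target" ]; then
  echo "profiles/system is stale: $default_target"
  echo "next-system is correct:   $next_target"
fi
```

Anything that activates /nix/var/nix/profiles/system on soft-reboot would pick the stale generation in this situation, which is why a dedicated /run/next-system link is used instead.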

@arianvp arianvp marked this pull request as draft May 8, 2024 08:34
Member Author

arianvp commented May 8, 2024

Moving this back to draft. This needs to cook a bit more.

But just keeping this PR around for notes and feedback

I just realized a pretty serious defect:

soft-reboot re-execs into SYSTEMD_BINARY_PATH (Or into /run/nextroot/$SYSTEMD_BINARY_PATH if it exists) (https://github.com/systemd/systemd/blob/18303adcd3a9b16339008b5fa909a7c09247072d/src/core/main.c#L1942)

but we set SYSTEMD_BINARY_PATH=/run/current-system/systemd/lib/systemd https://github.com/NixOS/nixpkgs/blob/master/pkgs/os-specific/linux/systemd/default.nix#L799

which means soft-reboot doesn't execute the new systemd version but re-execs into the old one.

Maybe we do need to do the /run/nextroot population instead

@arianvp arianvp removed request for a team and dasJ May 8, 2024 08:52
Member Author

arianvp commented May 8, 2024

Wait no.. We call /run/next-system/activate just before the soft-reboot, which creates the /run/current-system symlink! This actually already works. 🥳

However I want to have a NixOS Test that asserts this
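
Until such a test exists, the reasoning can be illustrated with a shell sketch. Assumptions: a scratch directory stands in for /, the store path names are hypothetical, a text file stands in for systemd at the path quoted above, and `ln -sfn` stands in for the part of activate that updates /run/current-system. Because SYSTEMD_BINARY_PATH goes through the /run/current-system symlink, repointing the symlink changes what the same path resolves to.

```shell
#!/bin/sh
set -eu

root=$(mktemp -d)
for gen in old new; do
  mkdir -p "$root/nix/store/$gen-system/systemd/lib"
  # Text file standing in for that generation's systemd.
  echo "$gen systemd" > "$root/nix/store/$gen-system/systemd/lib/systemd"
done
mkdir -p "$root/run"

# Same shape as the SYSTEMD_BINARY_PATH quoted above, under the scratch root.
binary_path="$root/run/current-system/systemd/lib/systemd"

# Before activation: current-system is the old generation.
ln -sfn "$root/nix/store/old-system" "$root/run/current-system"
before=$(cat "$binary_path")

# Activating the next system repoints current-system.
ln -sfn "$root/nix/store/new-system" "$root/run/current-system"
after=$(cat "$binary_path")

echo "before activation: $before"
echo "after activation:  $after"
```

This only shows the path resolution; whether the running manager really re-executes the new binary is what the proposed NixOS test would have to assert.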

echo "Loading NixOS system via kexec."
exec kexec --load $p/kernel --initrd=$p/initrd --append="$(cat $p/kernel-params) init=$p/init"
fi
done
Contributor

I don't think we should mess with kexec at all in this PR. It's actually got some problems as is and this is only going to make it more confusing.

Member Author

Yeah, I'll drop the commit (and maybe open a separate PR for it)

systemd.tmpfiles.rules = [ "L+ /nix/var/nix/gcroots/next-system - - - - /run/next-system" ];

# if /run/next-system exists, activate it before rebooting
systemd.services.systemd-soft-reboot-activate-next-system = {
Contributor

What happened to doing this in ExecStartPre in systemd-soft-reboot.service with a - in the front of the command? That seemed quite elegant to me. With a new unit, I'm less sure that the unit ordering is right.

Member Author

I copied the ordering from prepare-kexec. I'm pretty confident this is right.

With ExecStartPre=, it would complain about not being able to connect to journald:

May 07 21:11:02 utm (activate)[6377]: systemd-soft-reboot.service: Failed to connect stdout to the journal socket, ignoring: Connection refused

soft-reboot.target causes systemd-journald.service to exit before it is reached (it has a Conflicts on it) to make sure logs are properly flushed before the switch is made. I think that's what's causing this.

Contributor

Well, the ordering from prepare-kexec isn't exactly relevant, because it isn't really ordered against anything at all except kexec itself. I would think we would want activation to happen as late as possible, i.e. after as much as possible has shut down. Maybe this doesn't matter; it would just mimic an actual reboot better if activation happened when everything was shut down.
