fix(dracut-shutdown): add cleanup handler on failure #1689

Merged
merged 1 commit into from Feb 2, 2022

rmetrich
Contributor

@rmetrich rmetrich commented Jan 13, 2022

dracut-shutdown.service may fail, for example by timing out when bandwidth is very low.
In such a case, for hardening purposes, a new dracut-shutdown-onfailure.service unit needs to run and clean up after dracut-shutdown.service, so that a later switch of root into an incomplete initramfs cannot occur.

See also RHBZ #1924587.
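
As a rough illustration of what such a cleanup could amount to, the sketch below disarms a partially restored /run/initramfs by removing its `shutdown` entry point. It relies on the fact that systemd only returns to the initramfs at shutdown when /run/initramfs/shutdown exists and is executable. The function name `disarm_shutdown_initramfs` and its directory argument are hypothetical and not taken from this PR; the unit added here may perform the cleanup differently. One way to trigger such a helper would be systemd's `OnFailure=` setting on dracut-shutdown.service.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from this PR: disarm a partially unpacked
# shutdown initramfs so the final shutdown stage cannot pivot into it.
# Assumption: removing the "shutdown" entry point is enough, because systemd
# only jumps back into the initramfs when <dir>/shutdown is executable.
disarm_shutdown_initramfs() {
    # Directory holding the restored initramfs; defaults to /run/initramfs.
    initramfs_dir="${1:-/run/initramfs}"
    # rm -f succeeds even when the entry point was never unpacked.
    rm -f -- "$initramfs_dir/shutdown"
}
```

Calling `disarm_shutdown_initramfs` with no argument targets /run/initramfs; passing a scratch directory allows trying it out without touching the running system.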

This pull request changes...

Changes

Checklist

  • I have tested it locally
  • I have reviewed and updated any documentation if relevant
  • I am providing new code and test(s) for it

Fixes #

@github-actions github-actions bot added the dracut-systemd (Issues related to the dracut-systemd module) and modules (Issue tracker for all modules) labels Jan 13, 2022
@rmetrich rmetrich marked this pull request as draft January 13, 2022 16:39
@rmetrich rmetrich changed the title DRAFT: fix(dracut-shutdown): add cleanup handler on failure fix(dracut-shutdown): add cleanup handler on failure Jan 13, 2022
rmetrich added a commit to rmetrich/plymouth that referenced this pull request Jan 13, 2022
…ter dracut-shutdown-onfailure.service

It may happen that dracut-shutdown.service fails, for example on timeout
due to very low bandwidth.
In such case, for hardening purposes, a new dracut-shutdown-onfailure.service
unit doing dracut-shutdown.service cleanup needs to execute first, which will
ensure switching root to an incomplete initramfs won't occur.

See related dracut PR #1689 (dracutdevs/dracut#1689).

Signed-off-by: Renaud Métrich <rmetrich@redhat.com>
rmetrich added a commit to rmetrich/plymouth that referenced this pull request Jan 13, 2022
…ter dracut-shutdown-onfailure.service

It may happen that dracut-shutdown.service fails, for example on timeout
due to very low bandwidth.
In such case, for hardening purposes, a new dracut-shutdown-onfailure.service
unit doing dracut-shutdown.service cleanup needs to execute first, which will
ensure switching root to an incomplete initramfs doesn't occur.

See related dracut PR #1689 (dracutdevs/dracut#1689).

Signed-off-by: Renaud Métrich <rmetrich@redhat.com>
@rmetrich
Contributor Author

See also related plymouth PR https://github.com/freedesktop/plymouth/pull/6
(plymouth must run after dracut-shutdown-onfailure.service when the latter executes)

@rmetrich rmetrich marked this pull request as ready for review January 14, 2022 07:37
@rmetrich
Contributor Author

rmetrich commented Jan 14, 2022

Serial console output on success (similar to current code):

[  OK  ] Stopped Restore /run/initramfs on shutdown.
         Starting Tell Plymouth To Jump To initramfs...
[  OK  ] Stopped target Local File Systems.
         Unmounting /boot...
[  OK  ] Started Tell Plymouth To Jump To initramfs.
[  OK  ] Unmounted /boot.

Serial console output on failure (no jump to initramfs):

[  OK  ] Stopped Restore /run/initramfs on shutdown.
         Starting Service executing upon dra…down failure to perform cleanup...
[  OK  ] Stopped target Local File Systems.
         Unmounting /boot...
[  OK  ] Started Service executing upon drac…utdown failure to perform cleanup.
[  OK  ] Unmounted /boot.

@johannbg
Collaborator

@rmetrich looking at that RHBZ report, I think you should do two things before proceeding further: a) try to find the actual root cause of the unpacking failing in the first place, and b) try to reproduce it on a distribution with a more modern core/base OS stack, since RHEL releases are made out of outdated core/base OS components.

@rmetrich
Contributor Author

I agree RHEL is quite behind but IMHO hardening is still valuable.

I saw this recently on HP hardware which was sending bursts of Ctrl-Alt-Del. There was a PR to harden against this, which is now upstream (commit #b9ba3c8bb8f0f1328cd1ffaa8dbf64585b28c474).
More generally, hardware issues may happen that lead to the emergency prompt. I see this regularly, e.g. a Virtual DVD being unmounted due to an iLO timeout while the shutdown at the end of an installation occurs.

@johannbg
Collaborator

Regarding the DVD/iLO issue, isn't the problem there that the admin has not increased the idle timeout in the firmware? In a properly designed OS, the timeout needs to be consistent throughout the OS (which is not the case in Fedora/RHEL) and in the relevant hardware firmware as well.

@johannbg johannbg enabled auto-merge (rebase) January 15, 2022 06:15
auto-merge was automatically disabled January 17, 2022 06:58

Head branch was pushed to by a user without write access

It may happen that dracut-shutdown.service fails, for example on timeout
due to very low bandwidth.
In such case, for hardening purposes, a new dracut-shutdown-onfailure.service
unit doing dracut-shutdown.service cleanup needs to execute to make sure
switching root to an incomplete initramfs won't occur later.

See also RHBZ #1924587 (https://bugzilla.redhat.com/show_bug.cgi?id=1924587).
@johannbg johannbg enabled auto-merge (rebase) January 17, 2022 07:23
@johannbg johannbg merged commit 7ab1d00 into dracutdevs:master Feb 2, 2022
Labels
dracut-systemd (Issues related to the dracut-systemd module), modules (Issue tracker for all modules)
3 participants