Failed resume does not clear HibernateLocation EFI variable #32021

Closed
kleinph opened this issue Mar 30, 2024 · 1 comment · Fixed by #32043
Labels: bug 🐛 (Programming errors, that need preferential fixing), hibernate-resume

Comments

kleinph commented Mar 30, 2024

systemd version the issue has been seen with

255.4-2

Used distribution

Arch Linux

Linux kernel version used

6.8.2-arch2-1

CPU architectures issue was seen on

x86_64

Component

systemd

Expected behaviour you didn't see

The resume state should be cleared after a failed attempt to resume from hibernation, so that no further failed attempts occur.

Unexpected behaviour you saw

I tried to hibernate with a LUKS2-encrypted swap, which according to #30557 is not supported when relying only on gpt-auto detection.

As a result, resume times out and fails with the error:

Timed out waiting for device /dev/disk/by-uuid/<UUID-of-swap>.

However, the HibernateLocation EFI variable is not cleared, so after booting and then rebooting, the same timeout and error appear again.

To stop the failing resume attempts, the EFI variable must be cleared manually as described in #30395 (comment).
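To check whether a stale variable is still present before clearing anything, a minimal Python sketch along these lines could help. It assumes that efivarfs is mounted at the usual `/sys/firmware/efi/efivars` path and that the entry name starts with `HibernateLocation-` followed by a vendor GUID. The function name `find_hibernate_location_vars` is hypothetical, and the sketch only lists candidates; it does not delete anything.

```python
from pathlib import Path

# Assumption: the usual efivarfs mount point on Linux.
EFIVARS_DIR = Path("/sys/firmware/efi/efivars")


def find_hibernate_location_vars(efivars_dir: Path = EFIVARS_DIR) -> list[Path]:
    """Return efivarfs entries whose name starts with 'HibernateLocation-'."""
    if not efivars_dir.is_dir():
        # Not an EFI boot, or efivarfs is not mounted.
        return []
    # efivarfs names entries as '<VariableName>-<VendorGUID>'.
    return sorted(efivars_dir.glob("HibernateLocation-*"))


if __name__ == "__main__":
    for path in find_hibernate_location_vars():
        print(f"stale candidate: {path}")
```

If the script prints a path after a normal (non-resume) boot, the variable has probably been left behind by a failed resume. Actually removing it requires root privileges and is better done following the procedure referenced above.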

Steps to reproduce the problem

  • Hibernate with an unsupported setup (such as LUKS2-encrypted swap without crypttab).
  • Resume from hibernation.
  • Resume fails, as expected.
  • Reboot.
  • Resume is attempted again and fails again. This is not expected.

Additional program output to the terminal or log subsystem illustrating the issue

No response

@kleinph kleinph added the bug 🐛 Programming errors, that need preferential fixing label Mar 30, 2024
@github-actions github-actions bot added the pid1 label Mar 30, 2024
@YHNdnzj YHNdnzj self-assigned this Mar 30, 2024
YHNdnzj added a commit to YHNdnzj/systemd that referenced this issue Apr 1, 2024
that clears stale HibernateLocation EFI variable

Currently, if the HibernateLocation EFI variable exists,
but we failed to resume from it, the boot carries on
without clearing the stale variable. Therefore, the subsequent
boots would still be waiting for the device timeout,
unless the variable is purged manually.

There's no point to keep trying to resume after a successful
switch-root, because the hibernation image state
would have been invalidated by then. OTOH, we don't
want to clear the variable prematurely either,
i.e. in initrd, since if the resume device is the same
as root one, the boot won't succeed and the user might
be able to try resuming again. So, let's introduce a
new systemd-hibernate-resume-clear-efi.service unit,
that only runs after switch-root.

Fixes systemd#32021
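
The reasoning in the commit message can be illustrated with a small simulation. This is only a sketch of the decision rule, not the actual systemd implementation: `BootState`, `should_clear_variable` and `simulate_boots` are hypothetical names, and the model assumes every boot reaches switch-root successfully.

```python
from dataclasses import dataclass


@dataclass
class BootState:
    in_initrd: bool         # still running from the initrd (before switch-root)
    variable_present: bool  # HibernateLocation EFI variable exists


def should_clear_variable(state: BootState) -> bool:
    """Clear only once past switch-root and only if the variable still exists."""
    return state.variable_present and not state.in_initrd


def simulate_boots(boots: int, with_fix: bool) -> list[bool]:
    """Return, per boot, whether a resume attempt (and its timeout) happens."""
    variable_present = True  # left behind by the hibernation that failed to resume
    attempts = []
    for _ in range(boots):
        # The initrd tries to resume whenever the variable exists.
        attempts.append(variable_present)
        # Assumption: the boot then continues to the real root (in_initrd=False).
        state = BootState(in_initrd=False, variable_present=variable_present)
        if with_fix and should_clear_variable(state):
            variable_present = False
    return attempts


if __name__ == "__main__":
    print("without fix:", simulate_boots(3, with_fix=False))
    print("with fix:   ", simulate_boots(3, with_fix=True))
```

Without the clearing step, every boot in the model repeats the resume attempt; with it, only the first boot after the failed hibernation does. Clearing is never triggered while `in_initrd` is true, which mirrors the stated wish not to purge the variable prematurely.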
YHNdnzj (Member) commented Apr 1, 2024

So apparently people encounter this at a much higher rate than I originally expected... Fix is waiting in #32043.

YHNdnzj added a commit to YHNdnzj/systemd that referenced this issue Apr 3, 2024
stale HibernateLocation EFI variable

Currently, if the HibernateLocation EFI variable exists,
but we failed to resume from it, the boot carries on
without clearing the stale variable. Therefore, the subsequent
boots would still be waiting for the device timeout,
unless the variable is purged manually.

There's no point to keep trying to resume after a successful
switch-root, because the hibernation image state
would have been invalidated by then. OTOH, we don't
want to clear the variable prematurely either,
i.e. in initrd, since if the resume device is the same
as root one, the boot won't succeed and the user might
be able to try resuming again. So, let's introduce a
unit that only runs after switch-root and clears the var.

Fixes systemd#32021