Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

systemd-sleep: always attempt hibernation if configured #14270

Merged

Conversation

zachsmith
Copy link
Contributor

@zachsmith zachsmith commented Dec 7, 2019

When calculation of swap file offset is unsupported, rely on the /sys/power/resume & /sys/power/resume_offset values rather than requiring a matching swap entry to be identified.

Refactor to use dev_t for comparison of resume= device instead of string.

Fixes #14249

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from ffe5cee to 0f015c0 Compare December 7, 2019 01:09
@zachsmith zachsmith changed the title systemd-sleep: fix hibernation regression on btrfs systemd-sleep: fix hibernation regression on Btrfs Dec 7, 2019
@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch 2 times, most recently from 37349d5 to d9774e7 Compare December 7, 2019 01:37
Copy link
Contributor

@diabonas diabonas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Btrfs the comparison between the device ID from stat

swap_dev = streq(swap->type, "partition") ? sb.st_rdev : sb.st_dev;
and the device ID from /sys/power/resume
r = read_one_line_file("/sys/power/resume", &resume);
does not work, see #14249 (comment). Therefore this PR does not allow for hibernation on my system (it might be further complicated by the fact that my Btrfs file system is encrypted using LUKS, I will be able to test a non-encrypted file system tomorrow at the earliest).

I think the easiest fix would be to record in the /proc/swaps parsing loop whether a swap file on a Btrfs file system has been found (i.e. whether resume_offset is 0 for one of the entries) and change

if (!streq(sys_resume, "0:0") && !location_is_resume_device(hibernate_location, sys_resume, sys_offset))
return log_warning_errno(SYNTHETIC_ERRNO(ENOSYS), "/sys/power/resume and /sys/power/resume_offset has no matching entry in /proc/swaps; Hibernation will fail: resume=%s, resume_offset=%" PRIu64,
sys_resume, sys_offset);
to produce only a warning in this case, but to continue with the selected swap entry anyway (like systemd did before #13682).

@yuwata yuwata added the sleep label Dec 7, 2019
@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch 3 times, most recently from f048cce to fc51f44 Compare December 7, 2019 23:57
@lgtm-com
Copy link

lgtm-com bot commented Dec 8, 2019

This pull request introduces 1 alert when merging fc51f44 into debda5a - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch 2 times, most recently from ab919a1 to 814e467 Compare December 8, 2019 00:34
@lgtm-com
Copy link

lgtm-com bot commented Dec 8, 2019

This pull request introduces 1 alert when merging 814e467 into debda5a - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 814e467 to eb4c81f Compare December 8, 2019 01:10
@lgtm-com
Copy link

lgtm-com bot commented Dec 8, 2019

This pull request introduces 1 alert when merging eb4c81f into debda5a - view on LGTM.com

new alerts:

  • 1 for Comparison result is always the same

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch 2 times, most recently from ebedc31 to 1826278 Compare December 8, 2019 03:47
break;
}
}

if (!hibernate_location)
return log_debug_errno(SYNTHETIC_ERRNO(ENOSYS), "No swap partitions or files were found");

if (!streq(sys_resume, "0:0") && !location_is_resume_device(hibernate_location, sys_resume, sys_offset))
int resume_match = location_is_resume_device(hibernate_location, sys_resume, sys_offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only check whether the selected swap entry (i.e. the "highest priority swap with most remaining space") is on a Btrfs file system. This fails in case

  • there are multiple swap entries and
  • the hibernation location is a Btrfs swap file (hence cannot be found by location_is_resume_device in the parsing loop), but
  • the selected swap entry is not on a Btrfs file system.

Copy link
Contributor Author

@zachsmith zachsmith Dec 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would only happen if you had a swap file on a non-Btfs fs configured with a higher priority than the swap entry for resume= and resume_offset=.

I wonder if what we should do instead is to bypass the location detection if SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK is set because if the user has that set it seems to imply that they have their swaps configured correctly. In this case, find_hibernate_location would just return the values for resume= and resume_offset= or return an error if they are not set.

Copy link
Contributor Author

@zachsmith zachsmith Dec 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diabonas, I pushed up a new change that will allow SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK to bypass the location detection if the user has correctly configured /sys/power/resume (and by extension, /sys/power/resume_offset when using a swapfile).

assert(location);
assert(location->resume);
assert(sys_resume);

return streq(sys_resume, location->resume) && sys_offset == location->resume_offset;
bool resume_is_set = !streq(sys_resume, "0:0");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, can't we change thing so that instead of string comparing sys_resume with "0:0" everywhere we just say that sys_resume set to NULL means what "0:0" currently means? i.e. in read_resume_files() already filter our "0:0" and set *ret_resume to NULL when we see it.

Propagating this magic "I am not set" string through all layers is just weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually thinking about this, read_resume_files() should probably change ret_resume to be of type dev_t, and use parse_dev() to parse it to dev_t directly? i.e. we should parse where we read the data, and then prpoagate parsed data, instead of literal strings

*/
if (getenv_bool("SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK") > 0) {
if (streq(sys_resume, "0:0"))
return log_warning_errno(SYNTHETIC_ERRNO(ENOSYS),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation off

if (resume_match >= 0) {
log_debug("Hibernation will attempt to use swap entry with path: %s, device: %s, offset: %" PRIu64 ", priority: %i",
hibernate_location->swap->device, hibernate_location->resume, hibernate_location->resume_offset, hibernate_location->swap->priority);
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no {} around single-line if blocks according to our coding style

Copy link
Contributor

@diabonas diabonas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest version of this PR works for my LUKS-encrypted Btrfs swap file when enabling SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1.

I'm not sure about the purpose of this environment variable though: to me it would make more sense if we would record if location_is_resume_device returned -1/maybe for any of the entries in /proc/swaps and use the swap defined by sys_resume/sys_offset in this case (probably with a warning message that it wasn't possible to check the free space). After all, this is just an additional safety check that is impossible to fulfil with the current implementation, why make users jump through additional hoops to enable the use of their explicitly configured swap device?

@@ -421,6 +496,13 @@ static bool enough_swap_for_hibernation(void) {
if (r < 0)
return false;

/* in cases where we were not able to identify available swap information, log a message with override instructions and fail */
if (r > 0 && hibernate_location->swap == NULL) {
log_debug("Unable to determine available swap space for resume=%s, resume_offset=%" PRIu64 ";set SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1 to override this check.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be log_warning, otherwise most people will never see this information.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. It would be great to make unable to check the space as a warning while confirmed insufficient space as error. After all if hibernation failed, the system should continue to function without corruption.

src/sleep/sleep.c Outdated Show resolved Hide resolved
@poettering
Copy link
Member

looks pretty ok, but the %lu thing looks suspicious?

@poettering
Copy link
Member

and thanks for the enum and dev_t rework, this looks a lot better now!

@poettering
Copy link
Member

ci didn't like it

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 5b5cde5 to 56a82b2 Compare December 13, 2019 15:57
@zachsmith
Copy link
Contributor Author

Works fine on my LUKS-encrypted system with a swap file on a Btrfs partition when setting SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK=1 in /etc/systemd/system/systemd-logind.service.d/override.conf, thanks :) +1

Awesome! Glad to hear it worked with your LUKS setup. 👍

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 8b359b0 to 989fbb1 Compare December 15, 2019 01:27
Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the end, I think the best course of action is to give Btrfs users the ability to override find_hibernate_location - if they have /sys/power/resume and /sys/power/resume_offset configured correctly, hibernation should work (unless there isn't enough space).

Yes! I started working on a review yesterday, and wrote something to the same effect. Please squash/rebase the commits. The comments below might be outdated already.

typedef enum {
UNMATCHED = 0,
MATCHED = 1,
UNKNOWN = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RESUME_UNMATCHED, RESUME_MATCHED, RESUME_UNKOWN.
And put a comma after the last definition too, no need to make it special.

return log_warning_errno(SYNTHETIC_ERRNO(ENOSYS),
"SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK is set but /sys/power/resume is not configured; hibernation will fail");
if (sys_resume == 0)
return log_warning_errno(SYNTHETIC_ERRNO(ENOSYS),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chunk should go the first commit.

@@ -310,6 +328,19 @@ int find_hibernate_location(HibernateLocation **ret_hibernate_location) {
if (r < 0)
return r;

/* Users of swapfiles on Btrfs should override swap discovery because
* systemd will be unable to correctly calculate offsets.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this recommendation is good. If we cannot determine if the offset if specified correctly because of unsupported fs type, we should just assume it is. There is absolutely no point in asking the users to set a variable, since the result will be exactly the same (we will accept the value), except that the user has to do a manual step.

@keszybz
Copy link
Member

keszybz commented Dec 15, 2019

Oh, I read your comment in the issue, and I see that I misunderstood what you are saying. If the user uses brtrfs swapfile, we should just assume that the configuration is correct in the part that we cannot verify, and check the parts that we can verify. No point in asking the user to set a variable.

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 989fbb1 to 928541d Compare December 15, 2019 16:12
@zachsmith zachsmith changed the title systemd-sleep: fix hibernation regression on Btrfs systemd-sleep: remove matching swap requirement for resume= Dec 15, 2019
@zachsmith
Copy link
Contributor Author

Oh, I read your comment in the issue, and I see that I misunderstood what you are saying. If the user uses brtrfs swapfile, we should just assume that the configuration is correct in the part that we cannot verify, and check the parts that we can verify. No point in asking the user to set a variable.

Thanks, @keszybz. I think this should be ready to go unless there is more feedback that needs to be incorporated.

@@ -39,18 +39,19 @@ STATIC_DESTRUCTOR_REGISTER(arg_verb, freep);

static int write_hibernate_location_info(const HibernateLocation *hibernate_location) {
char offset_str[DECIMAL_STR_MAX(uint64_t)];
char resume_str[DECIMAL_STR_MAX(dev_t)];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i'd size this DECIMAL_STR_MAX(unsigned) * 2 + STRLEN(":") or so

@poettering
Copy link
Member

@poettering migrating the resume device comparison from string to dev_t re-exposed the fact that Btrfs will not return the correct dev_t from stat.

Note that we have a helper to deal with that: get_block_device() and btrfs_get_block_device_fd() which have specil knowledge of btrfs ioctls to retrieve the backing block device.

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 928541d to df70f92 Compare December 17, 2019 15:46
@zachsmith
Copy link
Contributor Author

Note that we have a helper to deal with that: get_block_device() and btrfs_get_block_device_fd() which have specil knowledge of btrfs ioctls to retrieve the backing block device.

Thanks @poettering - changed swap_device_to_major_minor to use get_block_device in the case of a swap file.

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry, I think the code is still trying to do the wrong thing. If we cannot figure out if a device matches (because it is on btrfs or another fs where our implementation is deficient), we should emit a debug level warning and assume that the device matches.

Partially quoting @diabonas:

in what kind of situation would resume and resume_offset be configured, but SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK should be unset [to avoid false positive because the code cannot deal with some fs types]?

I think that the answer is "never". I.e. the variable makes sense to skip size-of-memory-check, but we should not require it for skipping unsupported filesystems.

@@ -373,13 +403,18 @@ int find_hibernate_location(HibernateLocation **ret_hibernate_location) {
if (r < 0)
return r;

dev_t swap_device;
r = parse_dev(swap_device_id, &swap_device);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First a conversion from a number to a string, and then back from a string to a number. swap_device_to_major_minor() should just return dev_t.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree with @keszybz. There is no point to ask the user to set an environment variable.
When the user setup kernel parameters it is already explicit enough.

There are a few important points to point out of why this environment variable is inconvenient.

  • Present: It will break existing setup. The user must figure out the new way and begs the question why the old way just work. It also means that the PR that introduces this bug is incomplete.

  • Future: Eventually Systemd is going to fix this, where the memory check is done properly for swap file on btrfs. This benefit will not propagate to the user since the environment variable is set. Once this environment variable is configured it will stay there. I am not going to monitor the changelog for when I can remove the variable to get the benefit of memory checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First a conversion from a number to a string, and then back from a string to a number. swap_device_to_major_minor() should just return dev_t.

Thanks for this suggestion. This was a good catch.

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from df70f92 to 9a5e8cc Compare December 21, 2019 01:54
@zachsmith
Copy link
Contributor Author

No, sorry, I think the code is still trying to do the wrong thing. If we cannot figure out if a device matches (because it is on btrfs or another fs where our implementation is deficient), we should emit a debug level warning and assume that the device matches.

Partially quoting @diabonas:

in what kind of situation would resume and resume_offset be configured, but SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK should be unset [to avoid false positive because the code cannot deal with some fs types]?

I think that the answer is "never". I.e. the variable makes sense to skip size-of-memory-check, but we should not require it for skipping unsupported filesystems.

Updated to remove requirement to use SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK for Btrfs swap files.

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 9a5e8cc to e9b6fbd Compare December 21, 2019 02:24
@zachsmith
Copy link
Contributor Author

zachsmith commented Dec 21, 2019

Thanks for all the constructive feedback @diabonas, @lkjell, @helloworld1, @keszybz, and @poettering.

I've pushed changes to remove the requirement to set SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK when using a swap file on Btrfs - you all made good points on this.

For those of you using Btrfs swapfile, any testing with your setups would be appreciated.

@zachsmith zachsmith changed the title systemd-sleep: remove matching swap requirement for resume= systemd-sleep: always attempt hibernation if configured Dec 21, 2019
@diabonas
Copy link
Contributor

Great, thank you for all your work! The updated PR works out of the box (i.e. without having to set SYSTEMD_BYPASS_HIBERNATION_MEMORY_CHECK) on my system with a swap file on a LUKS-encrypted Btrfs partition.

@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from e9b6fbd to 8c8502f Compare December 21, 2019 17:12
src/shared/sleep-config.c Show resolved Hide resolved
if (r < 0)
return log_debug_errno(r, "Error reading /sys/power/resume: %m");

r = parse_dev(resume_str, &resume);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spurious space before "="

/*
* Determine if the HibernateLocation matches the resume= (device) and resume_offset= (file).
*/
static bool location_is_resume_device(const HibernateLocation *location, const dev_t sys_resume, const uint64_t sys_offset) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please drop the const from non-pointer parameters, that has no effect... (i.e. leave the one for location, drop the one from sys_resume and sys_offset)

bool resume_match = location_is_resume_device(hibernate_location, sys_resume, sys_offset);

/* resume= is set, but a matching /proc/swaps entry was not found */
if (!resume_match && sys_resume > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0? do we know dev_t is not signed? should be != 0, no?

src/shared/sleep-config.h Outdated Show resolved Hide resolved
src/shared/sleep-config.h Outdated Show resolved Hide resolved
@poettering poettering added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Jan 3, 2020
When calculation of swap file offset is unsupported, rely on the
/sys/power/resume & /sys/power/resume_offset values if configured
rather than requiring a matching swap entry to be identified.

Refactor to use dev_t for comparison of resume= device instead of string.
@zachsmith zachsmith force-pushed the systemd-sleep-fix-btrs-hibernate-bug branch from 8c8502f to 5213327 Compare January 6, 2020 04:15
@keszybz keszybz removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Jan 7, 2020
if (asprintf(&major_minor, "%u:%u", major(swap_dev), minor(swap_dev)) < 0)
return log_oom();
if (streq(swap->type, "partition")) {
if(!S_ISBLK(sb.st_mode))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after if.


/* resume= is set, but a matching /proc/swaps entry was not found */
if (!resume_match && sys_resume != 0) {
log_debug("/sys/power/resume appears to be configured but a matching swap in /proc/swaps could not be identified; hibernation may fail");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not OK. We can distinguish the cases where:

  • there is no swap in /proc/swaps
  • there is a device in /proc/swaps but a different device is configured in /sys/power/resume
  • /sys/power/resume is populated and there's a matching device in /proc/swaps but we cannot verify the offset.

With your patch, after swapoff -a I get:

/sys/power/resume appears to be configured but a matching swap in /proc/swaps could not be identified; hibernation may fail
Unable to determine remaining swap space; hibernation may fail
Hibernation configured and possible: yes

But hibernation will surely fail, so saying that it "might fail" is wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #14511. I think it DTRT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keszybz - sorry you had to go make some final changes :-(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zachsmith thank you for working on this. It's not an easy piece of code, as all the back-and-forth shows.

@keszybz keszybz mentioned this pull request Jan 7, 2020
@poettering poettering merged commit 5213327 into systemd:master Jan 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
7 participants