Cannot suspend/resume on Librem 13 v2 #2922

Open
d2r opened this Issue Jul 19, 2017 · 18 comments

d2r commented Jul 19, 2017

Qubes OS version (e.g., R3.2):

R3.2

Affected TemplateVMs (e.g., fedora-23, if applicable):

n/a, this is dom0


Expected behavior:

xscreensaver lock screen

Actual behavior:

system frozen, screen blank

Steps to reproduce the behavior:

Boot
Log on
Close lid
Wait 5 seconds
Open lid

General notes:

This was tested with dom0 kernel versions 4.4, 4.8, and 4.9 (qubes).
PureOS works fine.
Vanilla Fedora 23 works fine.

journalctl has the following messages (transcribed here, not cut&pasted):

52qubes-pause-vms[3239]: Failed to suspend VM dom0: Dom0 do not have libvirt object
libvirtd[1421]: End of file while reading data: Input/output error

Related issues:

d2r commented Jul 19, 2017

Tests were also done to suspend while playing media with audio. The audio stops on suspend but never resumes. Tested with kernels 4.4 and 4.9.

Member

marmarek commented Jul 19, 2017

Any other messages after that? I have a similar one and suspend does work. Also check the messages in /var/log/xen/console/guest-sys-net.log. And try to suspend when all VMs are off (including sys-net and sys-usb).

Does it actually suspend and then freeze during resume, or not suspend at all?
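
The checks suggested above could be scripted roughly as follows. This is only a sketch: the `suspend_log_lines` helper and its keyword list are hypothetical, and the commands in the trailing comments assume that dom0 keeps a persistent journal and that `qvm-shutdown` accepts `--all --wait` on this release.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that look related to
# suspend/resume. Reads stdin, writes the matching lines to stdout.
# The keyword list is an assumption, not an official one.
suspend_log_lines() {
    grep -iE 'suspend|resume|PM:|libvirtd|qubes-pause-vms'
}

# Possible usage in dom0 after the failed resume and a reboot
# (assumes the journal of the previous boot was persisted):
#   journalctl -b -1 --no-pager | suspend_log_lines
#   suspend_log_lines < /var/log/xen/console/guest-sys-net.log
#
# To repeat the test with all VMs off (assumed qvm-shutdown options):
#   qvm-shutdown --all --wait
```
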

d2r commented Jul 19, 2017

> Any other messages after that? I have a similar one and suspend does work.

There were plenty of other messages. I can test again and post a more complete log. Would any other logs be useful?

> And try to suspend when all VMs are off (including sys-net and sys-usb).

Same result when all VMs except dom0 were shut down.

> Does it actually suspend, then freeze during resume, or not suspend at all?

I hear the audio stop after I close the lid, and the sound is never heard again, even after opening the lid.

That in itself does not indicate whether the laptop froze while trying to suspend, or suspended successfully before freezing when trying to resume.

This is a common enough problem that other users of this laptop would also be interested in a solution, so I am willing to work some more to debug it.

Member

marmarek commented Jul 19, 2017

> That in itself does not indicate whether the laptop froze while trying to suspend, or suspended successfully before freezing when trying to resume.

Try it from the menu instead of closing the lid, and observe the power LED.

d2r commented Jul 19, 2017

> Try from the menu, instead of closing the lid, and observe power led.

OK, I will find time to try it and report back. It will be some time before I can, so if there are any other tests that would help let me know.

Member

marmarek commented Jul 19, 2017

> I can test again and post a more complete log. Would any other logs be useful?

If it reaches the moment where you get kernel messages related to suspend (and those messages actually hit the disk...), it would be useful. A real serial console is useful in such cases, but I assume you don't have one. I wonder what the Purism people used to debug coreboot there.
If it also fails with all VMs powered off, it is better to debug in that state, to avoid adding complexity to the picture.

See here for debugging hints: http://elixir.free-electrons.com/linux/latest/source/Documentation/power/s2ram.txt and also other files in that directory.

BTW do you have coreboot there?

d2r commented Jul 19, 2017

> BTW do you have coreboot there?

Yes it is coreboot -> grub -> qubes 3.2.

d2r commented Jul 20, 2017

> Try from the menu, instead of closing the lid, and observe power led.

On suspend the screen powers off, then the radio light, then the power light. After a second or two the power light comes on, and after several seconds more the power light "breathes." It breathed like this for several minutes.

When I hit a key on the keyboard, the power light went steady on, and the radio light turned on, but the screen remained blank.

New observations:
The Fn+F10 keyboard backlight adjustment works in the above state. If I type other keys (about 6th key press), then the keyboard backlight stops working.

After running echo 1 > /sys/power/pm_trace and suspending, I see no kernel messages written after I trigger suspend. I quickly hard-rebooted after the failed resume in order to preserve the magic number in the RTC. (This has the effect of setting the system clock to a date far into the future.)

   Magic number: 0:605:178
   hash matches /home/user/rpmbuild/BUILD/kernel-4.9.35/linux-4.9.35/drivers/base/power/main.c:1070
 acpi device:0e: hash matches
  platform hash matches

Looking for the device that matches:

$ find /sys/devices -iname '*0e*'
/sys/devices/pci0000:00/0000:00:1f.3/hdaudioC0D0/widgets/0e
/sys/devices/platform/PNP0C0E:00
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:0e
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00

One match is on the 3rd line above: device:0e. Under this directory I find power-related things, which seem to be what I would expect if it were working.

The other match was the 'platform' device, and there are a lot of things under there. It seems like a false positive.

So I am not sure what is causing the faulty resume.
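
For anyone repeating this test, the pm_trace procedure above can be sketched as a small script. It is a sketch under assumptions: the function names are made up for illustration, the kernel is assumed to be built with CONFIG_PM_TRACE_RTC, and `arm_and_suspend` has to be run as root in dom0.

```shell
#!/bin/sh
# Arm pm_trace and suspend to RAM. Warning: pm_trace stores its trace
# hash in the RTC, so the system clock will be wrong after the next boot.
# (Function name is hypothetical; it is defined here but not called.)
arm_and_suspend() {
    echo 1 > /sys/power/pm_trace
    echo mem > /sys/power/state
}

# After the hard reboot, reduce a kernel log read from stdin to the lines
# pm_trace prints: the magic number and every "hash matches" line.
pm_trace_matches() {
    grep -E 'Magic number:|hash matches'
}

# Usage right after rebooting from the failed resume:
#   dmesg | pm_trace_matches
```

The reboot should happen soon after the failed resume, since the kernel decodes the value left in the RTC during the next boot.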

d2r commented Jul 20, 2017

@marmarek Am I making sense above, or am I on the wrong track?

Member

marmarek commented Jul 20, 2017

So, it is this line. Looks like it is just after executing some driver-provided suspend callback. Maybe it is simply the last device before the system goes to sleep, and control never comes back to the kernel...
If so, it is most likely a Xen issue. I'd try a newer Xen version first - see here for the Qubes 4.0 test image, which has Xen 4.8 (compared to 4.6 in Qubes 3.2). You don't need to install it; simply start it in rescue mode and repeat the above procedure.

d2r commented Jul 21, 2017

Very helpful. I will try it.

d2r commented Jul 22, 2017

The process of using the test image via recovery was clunky, and I am not sure the value set in the RTC is correct, since it takes quite a while to reboot such that I can see the log. Here is what I got:

  Magic number: 0:268:1000
misc memory_bandwidth: hash matches
rtc_cmos 00:02: hash matches

Then I thought I would try actually installing from the image, and then try suspending when booting the installed test release from disk. The result was:

  • Suspend DOES work.
  • Hibernate fails immediately, kicking me out to the screensaver.
    • kernel: traps: xfsm-shutdown-h[SOME_PID] general protection ip:SOMEADDR sp:SOMEOTHERADDR error:0
      kernel:  in libc-2.24.so[ANOTHERADDR+1bc000]
      
    • systemd dumped core, I think, so I am attaching one of the core files to this issue

core.xfsm-shutdown-h.0.1dc1f0e65ca0442db9c76f430855a8ba.8738.1469476829000000000000.lz4.remove_txt_extension.txt

d2r commented Jul 22, 2017

I've been informed that Qubes does not yet support hibernate (#2414) due to lack of support from Xen. So concerning hibernate, this is expected behavior.

Concerning suspend: What can we do to get a supported work-around for Suspend in an official Qubes installation?

Member

marmarek commented Jul 23, 2017

Ok, if suspend does work on Xen 4.8, the easiest thing to do would be ... to wait for the final Qubes 4.0 release. The current test image should be quite usable already, and the first release candidate should be out in a week or two.
We don't have plans for any other major upgrade in Qubes 3.2.
But if you want, you can try compiling Xen 4.8 packages for Qubes 3.2 yourself. In theory it shouldn't be that complex:

  • https://www.qubes-os.org/doc/qubes-builder/
  • Use qubes-os-r3.2.conf instead of qubes-os-master.conf
  • After initial configuration, add BRANCH_vmm_xen = xen-4.8 to builder.conf
  • Execute the build - either the full one according to instructions above, or just components linked with Xen: vmm-xen core-libvirt core-vchan-xen gui-daemon (in this order)
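
The steps above might look roughly like this in practice. Treat it as a sketch: `set_xen_branch` is a hypothetical helper, and the `example-configs/` path and the per-component `make` targets in the comments are assumptions to be checked against the qubes-builder documentation linked above.

```shell
#!/bin/sh
# Hypothetical helper: add the Xen branch override to builder.conf,
# unless that exact line is already present (safe to run repeatedly).
# $1: path to builder.conf inside the qubes-builder checkout
set_xen_branch() {
    conf=$1
    line='BRANCH_vmm_xen = xen-4.8'
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Assumed workflow, run from the qubes-builder checkout after the
# initial setup described in the documentation:
#   cp example-configs/qubes-os-r3.2.conf builder.conf
#   set_xen_branch builder.conf
#   make get-sources
#   make vmm-xen core-libvirt core-vchan-xen gui-daemon
```
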
d2r commented Jul 23, 2017

Thanks @marmarek

> The current test image should be quite usable already

  1. Will there be any issue upgrading to the official release once it is out?
  2. Where can I learn how to use 4.0 if I have questions? (e.g., where did the Qubes VM manager go in 4.0?)

Another finding was that I upgraded dom0 using the current-testing branch (sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing) and after restarting it seemed that suspend/resume worked fine only when I suspended via echo mem > /sys/power/state, but it did not resume successfully after a lid close. The pm_trace Magic Number was identical to the 3.2 test above.

The best option seems to be to use the 4.0 test installation. It is good to know it is possible to build Xen 4.8 packages for Qubes 3.2 if I run into more trouble.

d2r commented Aug 5, 2017

@marmarek I downloaded RC1, and suspend/resume works well.

I would close this issue except that I see it has been added to the Release 3.2 Updates milestone.

FlashOfJhana commented Dec 18, 2017

I get compile errors with libvirt when switching to xen-4.8 on Qubes 3.2. (Maybe other branches need to be switched as well?) I would like to use the Qubes 4 RC, but my laptop from Purism has coreboot with no IOMMU.

from build-logs/core-libvirt-dom0-fc23.log:

In file included from libxl/libxl_domain.h:27:0,
from libxl/libxl_domain.c:28:
/usr/include/libxl.h:1283:5: note: expected 'const libxl_asyncop_how * {aka const struct *}' but argument is of type 'libxl_asyncprogress_how * {aka struct *}'
int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
^
libxl/libxl_domain.c:998:15: error: too few arguments to function 'libxl_domain_create_restore'
ret = libxl_domain_create_restore(cfg->ctx, &d_config, &domid,
^
In file included from libxl/libxl_domain.h:27:0,
from libxl/libxl_domain.c:28:
/usr/include/libxl.h:1283:5: note: declared here
int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
^
Makefile:8022: recipe for target 'libxl/libvirt_driver_libxl_impl_la-libxl_domain.lo' failed
make[3]: *** [libxl/libvirt_driver_libxl_impl_la-libxl_domain.lo] Error 1
make[3]: *** Waiting for unfinished jobs....

FlashOfJhana commented Dec 18, 2017

@marmarek Looks like it may need dom0=fc25?

Will try again.

Edit: still similar compile errors with fc25.
