Input/Output Errors and PCI devices unavailable after suspend #3049
andrewdavidwong
Aug 25, 2017
Member
This sounds like it might be a duplicate of #3008. The workaround is to blacklist iwlmvm. (See issue comments for details.)
andrewdavidwong closed this on Aug 25, 2017
andrewdavidwong added the duplicate label on Aug 25, 2017
andrewdavidwong
Aug 25, 2017
Member
(If it turns out not to be a duplicate, let me know, and we can reopen this.)
danjjeff
Aug 25, 2017
I wish it were. That is in the troubleshooting steps recommended in the wireless-troubleshooting doc. Unfortunately, adding iwlmvm to /rw/config/suspend-module-blacklist made no difference for me in testing.
danjjeff
Aug 25, 2017
I'm going to test downgrading to 3.2 and rolling back the kernel to see if that corrects it.
z4ppy
Aug 25, 2017
Hello,
I had the same issue on 3.2; it seems resolved for me since the last update
(4.9.35-19.pvops.qubes.x86_64).
2017-08-24 20:29 GMT+02:00 Daniel Jeffery <notifications@github.com>:
… Qubes OS version (e.g., R3.2):
3.2 and 4.0rc1
Affected TemplateVMs (e.g., fedora-23, if applicable):
dom0, sys-net
------------------------------
Expected behavior:
Qubes can be suspended and recover from suspend
Actual behavior:
After suspend Qubes is unstable.
Behavior is inconsistent. Sometimes networking is just disabled and nmcli
in the sys-net system VM reports that the ethernet and wireless devices are
unavailable and system is otherwise fine. At other times the sys-net VM is
unresponsive or shuts down completely and dom0 gives input/output errors
when attempting to open terminals or shutdown. The errors from dom0 are
also bizarre as they affect different commands from one test to the next.
Sometimes lspci will throw the error, other times dmesg, ls or less will
throw an error and lspci is fine. Often initctl will give the input/output
error and the system must be restarted.
Steps to reproduce the behavior:
Suspend qubes (close lid, use menu and echo mem > /etc/sys/suspend all
produce the same result)
Awaken (lift lid, push power button, it ignores keystrokes on my laptop)
General notes:
Hardware is a Lenovo X1 Carbon gen3, wireless adapter is Intel 7265 rev
59. I've run full system diagnostics on the laptop and it passes. I've
tested different nvme drives with no benefit.
I've tried the steps from #2922 and
https://www.qubes-os.org/doc/wireless-troubleshooting/#automatically-reloading-drivers-on-suspendresume,
but they have not helped. Related issues:
https://groups.google.com/forum/#!topic/qubes-users/LkP-6ORGwME
#2922
danjjeff
Aug 25, 2017
@z4ppy, unfortunately I didn't have it until I updated to 4.9.35-19 :(
rtiangha
Aug 25, 2017
@danjjeff: Yes, you need to restart sys-net once you've modified the blacklist file. You also need to make sure to blacklist both iwlmvm and iwlwifi in the file, not just iwlmvm.
danjjeff
Aug 25, 2017
I blacklisted both iwlmvm and iwlwifi. I restarted Qubes entirely. The problem persisted. Just to be sure I am not remembering incorrectly or making some mistake, I will try it again with the latest kernel.
I have reinstalled 3.2 fresh with it running 4.4.14-11 and there is no problem. Suspend works fine and I do not get the odd lockups or input/output errors. I'll update now to latest 4.9.35-19 and fedora 25 for the template and put in the blacklist lines.
danjjeff
Aug 25, 2017
It looks like I spoke too soon. The sys-net VM had not crashed after the suspend on the older kernel and NetworkManager reported the wifi connection was still up, but it wasn't passing any traffic and once I downed the connection nmcli couldn't bring it up again.
I've blacklisted iwlwifi and iwlmvm in sys-net:/rw/config/suspend-module-blacklist and restarted the sys-net VM. The behavior from the previous paragraph was exactly repeated.
A key difference worth noting in the behavior on the older kernel is that Network Manager still thinks the connection is active and the device is connected. Also, I don't seem to be getting the bizarre behavior in dom0. Also, restarting the sys-net VM is possible and everything works again afterward.
I'm going to proceed to update to fedora-25 and the newer kernel.
danjjeff
Aug 25, 2017
Okay, on 3.2 with kernel 4.9.35-19 and the fedora 25 template, I am currently seeing the same suspend behavior as on 4.4.14-11 with fedora 23. sys-net:/rw/config/suspend-module-blacklist contains iwlmvm and iwlwifi, each on their own line.
Since I don't seem to be getting the input/output errors anymore (for no reason?) on 3.2 I guess I'll stay here for now and just not suspend. I am very open to other ideas or troubleshooting.
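For reference, the blacklist setup discussed above (iwlmvm and iwlwifi, each on its own line in /rw/config/suspend-module-blacklist inside sys-net) could be scripted roughly as follows. This is only a sketch assuming bash; `add_to_blacklist` is a hypothetical helper name, not something from the Qubes documentation, and the thread itself shows that this workaround did not help on the affected machines.

```shell
# Hypothetical helper: append each module name to a blacklist file,
# one per line, skipping names that are already listed.
add_to_blacklist() {
    local file="$1"
    shift
    local module
    touch "$file" || return 1
    # If the file is non-empty and lacks a trailing newline, add one so
    # the next entry does not get glued onto the last existing line.
    if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    for module in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! grep -qxF -- "$module" "$file"; then
            printf '%s\n' "$module" >> "$file"
        fi
    done
}

# Assumed usage inside sys-net, as root, followed by a sys-net restart:
#   add_to_blacklist /rw/config/suspend-module-blacklist iwlmvm iwlwifi
```

Running it twice leaves the file unchanged, so it can be tried on a scratch file first.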
danjjeff
Aug 25, 2017
Hooked back up my USB mouse and found the sys-net VM has the same problems. After suspend, USB is also broken until the VM is restarted.
And now the input/output errors are back in dom0 and I can't stop and restart the VMs :)
I'm not sure if that is the result of just running it long enough or because I restarted the VMs and suspended a second time without a reboot, but they're back. I'm really wondering if this is a hardware failure at this point, but all the Lenovo system diagnostics come back fine.
rtiangha
Aug 26, 2017
I don't know; these symptoms are weird. I have an Intel 7260 dual ac card and it seems to work fine, although it's not a 7265 but one would think it was close enough. But it's also on a Dell L502X.
I noticed in your log output on the mail list that it couldn't load the wifi firmware. Just to double check, but is it actually installed (I assume it is, but you never know)?
sudo dnf install iwl7260-firmware or sudo dnf install linux-firmware (I'm not sure which; I'm a Debian guy)
Also, check the Lenovo website for any BIOS updates and, if they exist, try applying them. There are a few cases out there with similar symptoms on other distros, and they all seem to come from Lenovo users, so maybe this is a known hardware issue that the manufacturer has already addressed in a BIOS update.
I'd also go into your BIOS and double check any ACPI, Power Management, and Virtualization settings and ensure that they are all enabled properly.
danjjeff
Aug 26, 2017
It is weird. BIOS is a good point and I had upgraded the BIOS to latest right at the beginning of troubleshooting. The iwl7260 firmware is installed correctly by default. I don't know that the wireless or the USB layers are the right place to look at this point, though. This seems to be an issue with PCI passthrough after suspend or some other events that happen with time since I've had the issue start just after the machine has been running for a while.
The most frustrating part of this is the inconsistency. I have not had this problem since updating to 3.2 as soon as it released last year, but now it's present even on fresh 3.2 install. The problem seems to have started about 2 weeks ago.
I was able to get a hold of another identical (checked all the hardware chips and revisions) gen 3 X1 Carbon and compare against 3 more identical machines running Qubes, but not in my possession. The two in my possession that I have wiped and tested on Qubes 4, 3.2 as-installed and 3.2 up-to-date all exhibit the same behavior. Two of the other 3 seem to also exhibit the suspend behavior, but not the freezes while the 3rd is reported to be fine and is fully patched like the others.
I went ahead and booted up a live disk of Kali and there seem to be no issues. Suspend works fine and there are no unusual freezes or input/output errors. I'm at a bit of a loss where to even look next, but at this point I can only use Qubes if I don't let it suspend, and even then it has repeatedly locked up on me and lost work in progress.
If it will help, I am perfectly willing to overnight or 2-day one of these laptops to a dev to help sort this out.
rtiangha
Aug 26, 2017
As a last resort, maybe try one of @fepitre's 4.12 kernels that he posted to the mail list to see if it helps? The kernel options shouldn't be much different than 4.9's except for the new drivers introduced, but maybe the power management stuff is better. It seems kind of flaky in 4.9, especially when it comes to Intel wifi: having Intel power management disabled by default in the kernel makes suspend work for some cards and not others, and enabling it flips that around. Currently it's disabled in-kernel because having it enabled was causing too many issues. There's a sysctl or kernel value you can toggle to enable it yourself, but I don't know it off the top of my head.
andrewdavidwong added the bug label and removed the duplicate label on Aug 27, 2017
andrewdavidwong added this to the Release 3.2 updates milestone on Aug 27, 2017
andrewdavidwong reopened this on Aug 27, 2017
danjjeff
Aug 28, 2017
Well, I thought maybe I'd gotten somewhere over the weekend as I reinstalled 3.2 again and ... everything worked. I suspended and restarted several times and everything seemed fine, so I crossed my fingers and updated the kernel and the template VM, but then it went back to locked up PCI devices and input/output errors. I tried setting everything to use the old kernel, 4.4.14-11, but the problems persisted. I tried Reg's suggestion of using the 4.12 kernels and still had the same problems.
My conclusion at this point is that while the kernel may be involved, it's not just the kernel that's the problem. I'm going to try reinstalling 3.2 fresh, again, and see if I can get the state where everything works. I have done a fresh install of 3.2 about 5 times in the last several days and only had it work correctly that once, so I'm not optimistic it will work, but I am at a loss as to why it would be different. BIOS settings are the same, install parameters are the same.
One issue I don't think I've mentioned is that when this was working correctly, the system could shut down and restart successfully. When it's in the bad configuration, it hangs on shutdown. In the bad configuration, before anything appears to be broken (network and USB still working fine, no input/output errors), if I attempt to shut down or restart I can watch it attempt 8 different background processes, all of which appear to be dismounts, until it hits the 1m30s limit and then hangs with a series of errors like device-mapper: remove ioctl on [device] failed: Device or resource busy. At this point I'm forced to hard power off by holding down the power button.
Once I've done a suspend or waited long enough for input/output errors and/or net and usb errors, when I attempt to restart I get blk_update_request: I/O error, dev sda sector [some number that changes each time]. The second error made me wonder about bad nvme, but swapping it didn't change anything.
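Since the bad state shows up as a handful of recognizable messages, a saved log can be checked for them mechanically. The following is a rough sketch, assuming bash and a log file that was captured while the system could still read it; `count_signatures` is a hypothetical helper name. The first two patterns come from the messages quoted above, and the third (`Input/output error`) is an assumption about how the shell-level errors are worded.

```shell
# Hypothetical helper: for each known error signature, print the number
# of lines in a saved log that contain it, followed by the signature.
count_signatures() {
    local log="$1"
    local sig
    for sig in \
        'blk_update_request: I/O error' \
        'device-mapper: remove ioctl on' \
        'Input/output error'
    do
        # grep -c prints the count of matching lines (0 if none);
        # -i ignores case, -F treats the signature as a fixed string
        printf '%s\t%s\n' "$(grep -ciF -- "$sig" "$log")" "$sig"
    done
}

# Assumed usage on a previously saved file:
#   count_signatures dom0-dmesg.txt
```

Comparing the counts from a working boot and a broken one gives a quick way to see which of the symptoms actually appeared in a given test.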
danjjeff changed the title from "Input/Output Errors and network devices unavailable on suspend" to "Input/Output Errors and PCI devices unavailable after suspend" on Aug 28, 2017
rtiangha
Aug 28, 2017
So just to confirm, installing fresh from ISO works, updating system afterwards doesn't, and switching to an older kernel from that point still doesn't?
If you're going to re-install from scratch, can you capture dmesg output from both dom0 and sys-net and/or attach system logs when it's working? Then update just kernel and kernel-qubes-vm (and kernel-devel if you have it) in dom0 (sudo qubes-dom0-update kernel kernel-qubes-vm), try it again with the new kernel, and report back whether it still works (capture dmesg if it doesn't). If it does work, update dom0 and sys-net's template with the regular system updates and try again.
rtiangha
Aug 28, 2017
Also, at each step, verify the running kernel in both dom0 and sys-net by running uname -r as a sanity check.
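The capture-and-verify steps requested in the last two comments could be wrapped up along these lines. This is a sketch only, assuming it runs in dom0 under bash, that `qvm-run --pass-io` is available to relay a VM command's output back to dom0, and that `sudo` works without a password inside the VMs; `capture_logs` and the output file names are hypothetical.

```shell
# Hypothetical helper: save the running kernel version and dmesg output
# for dom0 and for each named VM into OUTDIR, one file per source.
capture_logs() {
    local outdir="$1"
    shift
    local vm
    mkdir -p "$outdir" || return 1

    # dom0 itself (prefix dmesg with sudo if dom0 restricts it to root)
    uname -r > "$outdir/dom0-uname.txt"
    dmesg > "$outdir/dom0-dmesg.txt"

    for vm in "$@"; do
        # --pass-io sends the VM command's stdout back to dom0
        qvm-run --pass-io "$vm" 'uname -r' > "$outdir/$vm-uname.txt"
        qvm-run --pass-io "$vm" 'sudo dmesg' > "$outdir/$vm-dmesg.txt"
    done
}

# Assumed usage, once per test step (fresh install, after kernel update, ...):
#   capture_logs ~/suspend-logs/fresh-install sys-net sys-usb
```

Using a separate output directory per step keeps the working and non-working captures side by side for comparison.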
danjjeff
Aug 28, 2017
I changed the title to better reflect this does not appear to be primarily about the network. Both USB and Network VMs lose their devices and dom0 is having issues even running dmesg and sometimes lspci or reading logs.
@rtiangha It doesn't always work fresh from the ISO. I reinstalled 3.2 at least 5 times over the last week and only once did it work correctly. It's reinstalling right now. If it works, I'll collect the logs, update just the kernel and kernel-qubes-vm packages on dom0 and see what that gets us. As noted, I probably can't capture dmesg when it's not working as that command throws the input/output error nearly all the time once we're in the bad state (as well as trying to less/vi/grep anything in /var/log). I've been uname'ing for exactly that reason all along the way. :) Thanks for your help.
rtiangha
Aug 28, 2017
Cool. Keep the post updated. Full logs where possible would be helpful to at least see what's going on. Personally, I've never seen this behaviour ever.
Also, am I correct in thinking that you've got sys-net acting as a combined USBvm as well, or do you have a separate sys-usb VM?
danjjeff
Aug 28, 2017
For all the tests over the last week I've had separate sys-usb and sys-net VMs. Are there any logs other than dmesg you'd like me to capture?
danjjeff closed this on Aug 28, 2017
danjjeff reopened this on Aug 28, 2017
danjjeff
Aug 28, 2017
The thing driving me nuts about this behavior is how inconsistent it's being. I'm trying to hold my install and config parameters totally consistent and think of anything I or the hardware are doing that could give different results, but I'm at a bit of a loss. One difference I just thought of between the two machines I'm testing with and the other three also running Qubes is the BIOS revision. These two are fully patched and I'm not sure if the other three have ever been patched from what the factory shipped.
rtiangha
Aug 28, 2017
Any kind of system logs would be useful. Do it in dom0, sys-net, and sys-usb. We're still in fact-finding mode.
Also, what kind of BIOS options does the machine have when it comes to Virtualization? On the surface, this sounds like something buggy with VT-d or maybe IOMMU.
Finally, you say you have a set of 2 machines with an updated BIOS and a set of 3 that doesn't? When this stuff works, on what set of machines does it actually work on? And what versions are they running?
danjjeff
Aug 28, 2017
In BIOS there is an option for Intel (R) Virtualization Technology and Intel (R) VT-d Feature. Both are enabled.
Of the 3 other machines, 1 seems to be fine and having no issues. One seems to be exhibiting the same issues on suspend as the two in my possession, but doesn't seem to develop problems simply by being on for some period of time. The third is having occasional freezes that require a hard power off (this is the behavior I was seeing before the network lockups and input/output errors started), but it seems fine recovering from suspend. I've confirmed that all three of those machines are running the same BIOS revision.
danjjeff
Aug 28, 2017
I'm having the problem immediately on the fresh install. I did the exact same fresh install twice on Friday. The first time was like this, with the problem behavior occurring immediately, the second time it was working fine until I updated the kernel. Today's fresh install is having the problems immediately.
Fortunately, this time it let me run dmesg while sys-net and sys-usb were unable to reach the hardware dedicated to them. I've attached the tarred up dmesg files.
danjjeff
Aug 28, 2017
Reinstalled 3.2 two more times on the same hardware and it has had the broken suspend right out of the install both times.
rtiangha
Aug 28, 2017
So are you saying that 1 machine out of the 5 works no matter what you do to it, and the others don't? What BIOS version are all of these running?
danjjeff
Aug 29, 2017
/sigh
I was attempting to just use one of the two systems I've been reinstalling and not suspend. So, I wrote a full response in firefox on the personal VM, then Qubes froze and I lost it. So, starting again.
All are identical gen 3 Lenovo X1 Carbons purchased as a single batch and running Qubes since late 2015.
In use by others:
1 of 5: BIOS 1.10, identical BIOS settings to my broken boxes. Qubes 3.2. Fedora 25 template VMs. No issues.
2 of 5: BIOS 1.10, BIOS settings not verified. Qubes 3.2. Fedora 25 template VMs. net/usb pci devices unavailable in VMs after a suspend (usually, works fine ~10% of the time).
3 of 5: BIOS 1.10, BIOS settings not verified. Qubes 3.2. Fedora 25 template VMs. Occasional OS freezes (every 3-4 days?). No suspend issues.
In my possession:
4 of 5: BIOS 1.17, identical BIOS settings to 1 of 5. Qubes 3.2 (also tested 4). Fedora 23 or 25 template VMs. PCI devices unavailable in VMs after suspend. If Firefox is actually running in one VM, it freezes after about 30 minutes. Reinstalled Qubes 4 twice and 3.2 three times in the last week.
5 of 5: BIOS 1.17, identical BIOS settings to 1 of 5. Qubes 3.2 (also tested 4). Fedora 23 or 25 template VMs. PCI devices unavailable in VMs after suspend. If Firefox is actually running in one VM, it freezes after about 30 minutes. Installed and tested Qubes 4, then reinstalled a fresh 3.2 five times in the last 4 days; it worked fine on one install until the kernel was updated.
Looking at this, I'll try rolling the BIOS back to 1.10. The problems on 2 and 3 are less consistent and less comprehensive than the problems on 4 and 5. I know there are some rollback-prevention BIOS settings I'll probably need to play with.
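To make the comparison above easier to eyeball, here is a rough Python sketch that tabulates the five machines by BIOS version. The dictionary keys and the `suspend_ok` flag are shorthand introduced here for the reports in the list, not output from any tool. Machine 2 is counted as failing even though it works about 10% of the time, and machine 3 is counted as fine because it shows freezes but no suspend issues.

```python
# Summary of the five machines listed above.
# "suspend_ok" is shorthand for "no suspend problems reported"
# and is an assumption made for this sketch, not a measured value.
machines = {
    1: {"bios": "1.10", "suspend_ok": True},
    2: {"bios": "1.10", "suspend_ok": False},  # works only ~10% of the time
    3: {"bios": "1.10", "suspend_ok": True},   # freezes, but no suspend issues
    4: {"bios": "1.17", "suspend_ok": False},
    5: {"bios": "1.17", "suspend_ok": False},
}


def suspend_failures_by_bios(fleet):
    """Return {bios_version: (failing_machines, total_machines)}."""
    result = {}
    for info in fleet.values():
        failing, total = result.get(info["bios"], (0, 0))
        if not info["suspend_ok"]:
            failing += 1
        result[info["bios"]] = (failing, total + 1)
    return result


if __name__ == "__main__":
    for bios, (failing, total) in sorted(suspend_failures_by_bios(machines).items()):
        print(f"BIOS {bios}: {failing} of {total} machines have suspend problems")
```

With only five machines this is a hint rather than evidence, but it is the reason the BIOS version looks like the first thing worth changing.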
rtiangha
Aug 29, 2017
Or rather than rolling back to 1.10, figure out what the BIOS settings of Number 3 are and copy them to the machines that aren't working. If Number 3 is the one that's working most of the time, you need to figure out what makes it different.
danjjeff
Aug 29, 2017
1 of 5 is the one with no issues. 3 has freezing issues fairly regularly. 4 and 5 have their BIOS settings set identical to 1's.
rtiangha
Aug 29, 2017
But your message said "3 of 5: BIOS 1.10, BIOS settings not verified. Qubes 3.2. Fedora 25 template VMs. Occasional OS freezes (every 3-4 days?). No suspend issues."???
rtiangha
Aug 29, 2017
Edit: Oops, my bad. I misread.
qubesos-bot added the r4.0-jessie-stable label and removed the r4.0-jessie-cur-test label (Feb 6, 2018)
qubesos-bot
Feb 6, 2018
Automated announcement from builder-github
The package qubes-core-agent_4.0.20-1+deb9u1 has been pushed to the r4.0 stable repository for the Debian template.
To install this update, please use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
qubesos-bot
Feb 6, 2018
Automated announcement from builder-github
The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-4.0.20-1.fc26) has been pushed to the r4.0 stable repository for the Fedora template.
To install this update, please use the standard update command:
sudo yum update
qubesos-bot added the r4.0-fc26-stable label and removed the r4.0-fc26-cur-test label (Feb 6, 2018)
qubesos-bot
Feb 6, 2018
Automated announcement from builder-github
The package core-agent-linux has been pushed to the r4.0 stable repository for the Fedora centos7 template.
To install this update, please use the standard update command:
sudo yum update
qubesos-bot added the r4.0-centos7-stable label and removed the r4.0-centos7-cur-test label (Feb 6, 2018)
A commit that referenced this issue was added to QubesOS/qubes-core-agent-linux (Feb 12, 2018)
qubesos-bot added the r3.2-fc23-cur-test label (Feb 12, 2018)
qubesos-bot referenced this issue in QubesOS/updates-status (Feb 12, 2018): core-agent-linux v3.2.23 (r3.2) #407 (Closed)
qubesos-bot added the r3.2-fc24-cur-test and r3.2-fc25-cur-test labels (Feb 12, 2018)
qubesos-bot
Feb 12, 2018
Automated announcement from builder-github
The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-3.2.23-1.fc26) has been pushed to the r3.2 testing repository for the Fedora template.
To test this update, please install it with the following command:
sudo yum update --enablerepo=qubes-vm-r3.2-current-testing
qubesos-bot added the r3.2-fc26-cur-test and r3.2-jessie-cur-test labels (Feb 12, 2018)
qubesos-bot
Feb 12, 2018
Automated announcement from builder-github
The package qubes-core-agent_3.2.23-1+deb9u1 has been pushed to the r3.2 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing stretch-testing (or appropriate equivalent for your template version), then use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
qubesos-bot added the r3.2-stretch-cur-test label (Feb 12, 2018)
qubesos-bot
Feb 12, 2018
Automated announcement from builder-github
The package qubes-core-agent_3.2.23-1+deb10u1 has been pushed to the r3.2 testing repository for the Debian template.
To test this update, first enable the testing repository in /etc/apt/sources.list.d/qubes-*.list by uncommenting the line containing buster-testing (or appropriate equivalent for your template version), then use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
qubesos-bot added the r3.2-buster-cur-test label (Feb 12, 2018)
qubesos-bot
Mar 12, 2018
Automated announcement from builder-github
The package qubes-core-agent_3.2.25-1+deb10u1 has been pushed to the r3.2 stable repository for the Debian template.
To install this update, please use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
qubesos-bot added the r3.2-buster-stable and r3.2-jessie-stable labels and removed the r3.2-buster-cur-test and r3.2-jessie-cur-test labels (Mar 12, 2018)
qubesos-bot
Mar 12, 2018
Automated announcement from builder-github
The package qubes-core-agent_3.2.25-1+deb9u1 has been pushed to the r3.2 stable repository for the Debian template.
To install this update, please use the standard update command:
sudo apt-get update && sudo apt-get dist-upgrade
qubesos-bot added the r3.2-stretch-stable and r3.2-fc25-stable labels and removed the r3.2-stretch-cur-test and r3.2-fc25-cur-test labels (Mar 12, 2018)
qubesos-bot
Mar 12, 2018
Automated announcement from builder-github
The component core-agent-linux (including package python2-dnf-plugins-qubes-hooks-3.2.25-1.fc26) has been pushed to the r3.2 stable repository for the Fedora template.
To install this update, please use the standard update command:
sudo yum update
danjjeff
Aug 24, 2017 (edited)
Qubes OS version (e.g., R3.2): 3.2 and 4.0rc1
Affected TemplateVMs (e.g., fedora-23, if applicable): dom0, sys-net
Expected behavior:
Qubes can be suspended and recover from suspend
Actual behavior:
After suspend Qubes is unstable.
Behavior is inconsistent. Sometimes networking is just disabled: nmcli in the sys-net system VM reports that the ethernet and wireless devices are unavailable, and the system is otherwise fine. At other times the sys-net VM is unresponsive or shuts down completely, and dom0 gives input/output errors when attempting to open terminals or shut down. The errors from dom0 are also bizarre in that they affect different commands from one test to the next. Sometimes lspci throws the error; other times dmesg, ls, or less throws an error and lspci is fine. Often initctl gives the input/output error and the system must be restarted.
Steps to reproduce the behavior:
Suspend Qubes (closing the lid, using the menu, and running echo mem > /sys/power/state all produce the same result)
Wake it up (lift the lid or push the power button; keystrokes are ignored on my laptop)
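The networking symptom described above (nmcli in sys-net reporting the ethernet and wireless devices as unavailable) could be checked after each resume with something like the sketch below. It is only a sketch: it assumes Python 3 and nmcli are available inside sys-net, it uses nmcli's terse output (`-t -f DEVICE,TYPE,STATE`), and the function names are made up for illustration.

```python
import subprocess


def unavailable_devices(nmcli_output):
    """Return the names of devices whose STATE field is 'unavailable'.

    Expects terse nmcli lines of the form DEVICE:TYPE:STATE.
    Assumes device names contain no ':' (nmcli escapes those in terse mode,
    which this simple split does not handle).
    """
    names = []
    for line in nmcli_output.splitlines():
        parts = line.strip().split(":")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        device, state = parts[0], parts[2]
        if state == "unavailable":
            names.append(device)
    return names


def check_after_resume():
    """Run nmcli (inside sys-net) and list the unavailable devices."""
    result = subprocess.run(
        ["nmcli", "-t", "-f", "DEVICE,TYPE,STATE", "device", "status"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return unavailable_devices(result.stdout)
```

Running `check_after_resume()` in sys-net before and after a suspend would give a quick yes/no on whether the devices came back. It does not help with the case where sys-net itself is unresponsive.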
General notes:
Hardware is a Lenovo X1 Carbon gen3; the wireless adapter is an Intel 7265 rev 59. I've run full system diagnostics on the laptop and it passes. I've tested different NVMe drives with no improvement.
I've tried the steps from #2922 and https://www.qubes-os.org/doc/wireless-troubleshooting/#automatically-reloading-drivers-on-suspendresume, but they have not helped.
Related issues:
https://groups.google.com/forum/#!topic/qubes-users/LkP-6ORGwME
#2922