sys-usb dies on suspend/un-suspend #4042

Closed
lunarthegrey opened this issue Jun 28, 2018 · 101 comments
Labels
C: power management diagnosed Technical diagnosis has been performed (see issue comments). P: major Priority: major. Between "default" and "critical" in severity. r4.0-dom0-stable r4.1-dom0-stable r4.2-host-stable T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@lunarthegrey

lunarthegrey commented Jun 28, 2018

Qubes OS version:

Qubes OS 4.0

Affected component(s):

sys-usb VM


Steps to reproduce the behavior:

Close laptop to put into suspend, re-open later, un-suspend and sys-usb no longer works.

Expected behavior:

Suspend doesn't crash sys-usb and works fine like other VMs.

Actual behavior:

sys-usb stops working and must be killed with qvm-kill sys-usb; qvm-shutdown does not work.

General notes:

My Qubes is fully patched with all of the latest stable updates (besides security testing). I am running kernel 4.14.41-1 and using the Fedora 28 template, although this also happened with the Fedora 26 template. I am unsure what logs I should be looking at to determine the cause. When the issue happens I have to force kill the VM and start it back up. I cannot get anything to run in sys-usb after a suspend/un-suspend.


Related issues:

Unknown.
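
The manual recovery described above (force-kill the hung qube, then start it again) could be scripted in dom0 roughly as follows. This is only a sketch: the `run` and `restart_qube` helpers and the `DRY_RUN` variable are made up for illustration. `qvm-kill` is used rather than `qvm-shutdown` because the latter is reported not to work in this state.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: force-kill a hung qube and start it again (run in dom0).
restart_qube() {
    local qube="${1:-sys-usb}"
    run qvm-kill "$qube" && run qvm-start "$qube"
}
```

Calling `DRY_RUN=1 restart_qube sys-usb` prints the two commands without running them, which is a safe way to check the script before relying on it.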

@andrewdavidwong andrewdavidwong added T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. C: other labels Jun 29, 2018
@andrewdavidwong andrewdavidwong added this to the Release 4.0 updates milestone Jun 29, 2018
@lunarthegrey
Author

It seems like if I suspend/un-suspend for a few minutes sys-usb works fine. But I just suspended for more than 8 hours, opened up my laptop and found that sys-usb is dead. I can confirm that when I suspend for a few minutes, logs are written, but when I suspend for longer periods of time, nothing is written to the log file /var/log/xen/console/guest-sys-usb.log

Not sure how I can go about debugging this one.
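
One way to check whether the console log was written to after a resume is to compare its modification time with the current time. Below is a minimal sketch for dom0, assuming GNU `stat` and `date`. The `log_age_seconds` function name is made up for illustration, and the log path is the one quoted above.

```shell
# Hypothetical helper: print how many seconds ago a file was last modified.
log_age_seconds() {
    local file="$1" now mtime
    now="$(date +%s)" || return 1
    mtime="$(stat -c %Y "$file")" || return 1
    echo $(( now - mtime ))
}

# Example (in dom0, shortly after resuming; may need sudo):
#   log_age_seconds /var/log/xen/console/guest-sys-usb.log
```

A large value right after a resume would suggest that the qube stopped writing to its console around the time of the suspend.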

@Kixunil

Kixunil commented Jul 1, 2018

I think I've just hit this issue too. I suspended my laptop overnight and then failed to attach a USB device to an AppVM.

Update: I attempted to relaunch it and it failed telling me to see the log file.

@mig5

mig5 commented Jul 16, 2018

This has affected me ever since upgrading from 3.2 to 4.

Because I use Yubikey auth for dom0 via the sys-usb, I have to shut down my sys-usb before I suspend, and then let my Yubikey auth script auto-power up the machine to run its script. Otherwise, the script doesn't fire after resume and I can't log in \o/

Like @lunarthegrey I found that if I quickly suspend then resume (like, a 30 second suspend), it sometimes works, but for longer periods of suspend (5 minutes or longer - even if plugged into AC power) sys-usb won't come back.

Some months back I recall seeing a lot of kernel module noise in the sys-usb log when it occurs, but didn't note it. Just tried to reproduce, but now my entire sys-usb AppVM seems to crash (the gnome-terminal I had open isn't clickable, can't xl console to the VM from dom0, etc). The VM is still considered 'Running' from qvm-ls output, but I can't do anything but qvm-kill it.

@lunarthegrey
Author

Anyone else experiencing this behavior? CC'ing @marmarek @andrewdavidwong since I am still experiencing this, and a few others are as well.

@andrewdavidwong andrewdavidwong added the P: major Priority: major. Between "default" and "critical" in severity. label Aug 29, 2018
@tquest1

tquest1 commented Sep 8, 2018

I'm having exactly the same problem on both Qubes 3.2 & 4.0 - Fedora 28 - kernel 4.14.57-2. It started about 1 month ago. Everything is updated, it had the same problem on the previous 2 kernels at least.
I am on a ThinkPad T450s. I support some other users with the same laptop - we all have exactly the same problem.

@marmarek
Member

marmarek commented Sep 8, 2018

I've seen a similar problem on a T460 (or T470?), but I don't have it right now to debug it there.
If somebody hits this issue, please try to collect some more info:

  1. Is sys-usb-dm alive (sudo xl console sys-usb-dm should give interactive shell)?
  2. Collect stack trace of all vcpus of sys-usb: gdbsx -c $(xl domid sys-usb) 64
  3. Exact kernel version running there
  4. Try xl sysrq sys-usb t and check sys-usb's console - probably won't work, but worth a try
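
The non-interactive steps above could be gathered into one dom0 script along these lines. Treat it as a sketch under assumptions: the `collect_diag` function and the output file names are made up for illustration, and the kernel step assumes the qube uses a dom0-provided kernel (read here with `qvm-prefs`), which may not hold for every setup. Step 1 is interactive and is left to be run by hand.

```shell
# Hypothetical helper: save non-interactive diagnostics for a qube into a directory.
# Run in dom0. Step 1 (sudo xl console <qube>-dm) is interactive and left out.
collect_diag() {
    local qube="${1:-sys-usb}" outdir="${2:-.}" domid
    domid="$(sudo xl domid "$qube")" || return 1

    # Step 2: stack trace of all vCPUs, same invocation as in the list above.
    sudo gdbsx -c "$domid" 64 > "$outdir/$qube-vcputrace.txt" 2>&1

    # Step 3: kernel assigned to the qube in dom0 (assumes a dom0-provided kernel).
    qvm-prefs "$qube" kernel > "$outdir/$qube-kernel.txt" 2>&1

    # Step 4: ask the guest kernel to dump task states to its console; may not work.
    sudo xl sysrq "$qube" t || echo "sysrq failed for $qube" >&2
}

# Example:
#   mkdir -p ~/sys-usb-diag && collect_diag sys-usb ~/sys-usb-diag
```

Any output from step 4 would appear in the qube's console log rather than in the output directory.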

@tquest1

tquest1 commented Sep 12, 2018

Attached are the results of sudo xl console sys-usb-dm and the stack trace. As for point 4, the console didn't work. The sys-usb is a fully up-to-date Fedora 28 with kernel 4.14.57-2 in HVM mode. dom0 is also fully up to date from the stable repos (Qubes 4.0).
sys-usb-dump.txt
sys-usb-vcputrace.txt

@airblag

airblag commented Sep 15, 2018

I have the same problem with QubesOS 3.2.
EDIT: I read the ticket too fast; in my case, sys-usb is not crashing but losing the USB devices after resume.

When I resume from suspend, I get this message in the console :

[   68.812049] uhci_hcd 0000:00:00.1: host controller process error, something bad happened!
[   68.812069] uhci_hcd 0000:00:00.1: host controller halted, very bad!
[   68.812765] uhci_hcd 0000:00:00.1: HC died; cleaning up

It seems to work if I blacklist uhci_hcd in sys-usb by editing /rw/config/suspend-module-blacklist (I also blacklisted ehci_hcd and ehci_pci to be sure).

[root@sys-usb ~]# echo "uhci_hcd" >>  /rw/config/suspend-module-blacklist
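
Note that the `echo ... >>` command above appends a duplicate entry each time it is run. A sketch of an idempotent variant, to be run as root inside sys-usb, might look like this. The `blacklist_module` function name is made up for illustration; the file path and module names are the ones mentioned above.

```shell
# Hypothetical helper: add a module name to a blacklist file only if it is not already listed.
blacklist_module() {
    local module="$1" file="${2:-/rw/config/suspend-module-blacklist}"
    touch "$file" || return 1
    # -x matches whole lines, -F treats the module name as a fixed string.
    grep -qxF -- "$module" "$file" || echo "$module" >> "$file"
}

# Example (as root in sys-usb):
#   for m in uhci_hcd ehci_hcd ehci_pci; do blacklist_module "$m"; done
```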

@tquest1

tquest1 commented Oct 3, 2018

I have tried adding various combinations of xhci_pci, xhci_hcd, uhci_hcd to /rw/config/suspend-module-blacklist with no luck.
Is there any progress on this bug? It's very inconvenient having to restart sys-usb after every suspend. Any further help in debugging? I did attach the requested stack trace above.

@lunarthegrey
Author

This doesn't appear to be an issue on kernel 4.14.74-1.pvops.qubes.x86_64 anymore. Anyone else notice this?

@mig5

mig5 commented Oct 30, 2018

@lunarthegrey you're right, I just tested suspend/resume and all works fine on 4.14.74-1.pvops.qubes.x86_64. Hope it wasn't a lucky once-off. Thanks for the heads up!

@andrewdavidwong
Member

Closing this as "resolved." If you believe the issue is not yet resolved, or if anyone is still affected by this issue, please leave a comment, and we'll be happy to reopen this. Thank you.

@mig5

mig5 commented Oct 31, 2018

Hmm, I left my laptop suspended overnight (instead of just for an hour or so) and I know my sys-usb was running at the time of suspend. But it was unreachable on resume this morning (I know because I rely on my yubikey to unlock the screensaver, and it couldn't, and the yubikey's light would not light up). If the sys-usb had been shut down, it would've autostarted and let me in.

So maybe this isn't quite fixed yet :(

@tquest1

tquest1 commented Nov 6, 2018

Unfortunately, I still have the problem on 4.14.74-1 too.

@lunarthegrey
Author

lunarthegrey commented Nov 6, 2018

Strange... I wonder what the difference we all have is. I am using the fedora-28 template that's fully up-to-date and is running 4.14.74-1 and I don't have the issue anymore after updating. Did many many suspend/resumes and sys-usb didn't die.

@mig5

mig5 commented Nov 7, 2018

Mine too is fedora-28 and 4.14.74-1. It worked the first couple of times, but only with short suspend periods of between 5 and 20 minutes. When it failed, it had been suspended overnight.

I'll try and reproduce again and if it dies (or at least USB interaction dies, like it did still for me last week), I'll try and grab logs.

@dmoerner

I also have this problem with fedora-29 templates and 4.14.116-1. sys-usb always crashes on suspend, and I think it occasionally prevents the system from suspending altogether, as I reported in #4806

@dmoerner

I can no longer reproduce this problem with kernel 4.19.43-1.

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.10.109-1.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-5-4 (including package kernel-5.4.188-1.fc25.qubes) has been pushed to the r4.0 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-2.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-2.fc25.qubes) has been pushed to the r4.0 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.10.109-1.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

marmarek added a commit to QubesOS/qubes-linux-kernel that referenced this issue Jun 11, 2022
@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.15.46-2.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.15.52-1.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.26-1.qubes.fc32) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.35-1.qubes.fc32) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update
