Combination of dom0 busy cpu and suspension triggers soft lock sometimes #7795
Labels
affects-4.1
This issue affects Qubes OS 4.1.
C: kernel
P: default
Priority: default. Default priority for new issues, to be replaced given sufficient information.
T: bug
Type: bug report. A problem or defect resulting in unintended behavior in something that exists.
logoerthiner1 added the P: default and T: bug labels on Sep 27, 2022
andrewdavidwong added the C: kernel and needs diagnosis labels on Sep 27, 2022
When I am doing suspend-resuming experiment with one core xen 4.14.5-10 kernel 6.0.2-2
Close with #7340
andrewdavidwong removed the needs diagnosis label on Nov 24, 2022
Qubes OS release
R4.1, kernel 5.18.16 / 5.15.64 (I believe the former exhibits the behavior more often, as that kernel is SMP PREEMPT_DYNAMIC)
Brief summary
I am still suffering from #7340 and am testing in which scenarios suspension fails. I ran the same command, `systemctl suspend`, over and over under different circumstances, with various numbers of VMs open and various `xenpm` options (#7794), and compared the results. Here I report a behavior related to a soft lock similar to #7696. It can be reproduced somewhat reliably.
When I start zero VMs and suspend with only dom0 running, the suspension success rate is high (but not 100%; the remaining failures mostly fall under #7340). When I run the command `python3 -c "while 1:pass"` in another terminal, suspension goes less smoothly. It does not fail like #7340 but like #7696: after several suspensions, the later suspensions have a long delay or cannot continue at all. On one occasion when this happened (`systemctl suspend` had been run but the system did not go to sleep), I pressed Ctrl-C on the test `python3` process, and the moment it died the system went to sleep.
It seems that dom0 CPU usage is special: when the dom0 CPU is in constant use, many operations become unstable. `python3 -c "while 1:pass"` is a test command that does nothing but keep the CPU busy in dom0; however, the behavior matches many existing issues and can be seen as a hint about many stability problems with various setups. I wonder if anyone can reproduce similar issues.
Steps to reproduce
1. `cpufreq=xen:powersave` (may be related; however, without this my suspension experiments do not last long)
2. `systemctl suspend` to make sure that your suspension mostly works
3. `python3 -c "while 1: pass"`
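
The manual steps above could be automated roughly as follows. This is only a sketch, not the procedure actually used in this report: `run_trials`, its parameters, and the idea of timing each call are assumptions made for illustration. Note also that `systemctl suspend` may return before the machine is actually asleep, so the elapsed time is at best a rough indicator of a delayed or stuck suspension.

```python
import subprocess
import sys
import time

# The busy loop from the report, started with the current interpreter.
BUSY_CMD = [sys.executable, "-c", "while 1: pass"]


def run_trials(suspend_cmd, trials=3, busy=True, pause=0.0):
    """Run suspend_cmd `trials` times and return a list of (returncode, seconds).

    When `busy` is true, a CPU-bound child process runs for the whole
    experiment and is terminated afterwards. (Hypothetical helper.)
    """
    load = subprocess.Popen(BUSY_CMD) if busy else None
    results = []
    try:
        for _ in range(trials):
            start = time.monotonic()
            proc = subprocess.run(suspend_cmd)
            results.append((proc.returncode, time.monotonic() - start))
            time.sleep(pause)  # give the system time to settle after resume
    finally:
        if load is not None:
            load.terminate()  # stop the busy loop even if a trial raised
            load.wait()
    return results
```

For the scenario described here it might be called as `run_trials(["systemctl", "suspend"], trials=5, pause=30)`, once with `busy=True` and once with `busy=False`, to compare the two cases. The pause length is a guess and would need tuning per machine.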
Expected behavior
Suspension works equally well whether or not dom0 is busy, and no soft lock happens
Actual behavior
Soft locks occur more and more frequently: usually after 3 suspensions the system becomes unstable with soft lock problems
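
To put numbers on "more and more frequently", the kernel log could be scanned after each round of suspensions. The sketch below assumes the soft lock shows up as the Linux watchdog's "soft lockup" message. The exact wording may differ between kernel versions, no log lines are quoted in this report, and `find_soft_lockups` is a hypothetical helper.

```python
import re

# Assumed format of the watchdog message; adjust if your kernel prints it differently.
SOFT_LOCKUP = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")


def find_soft_lockups(log_text):
    """Return (cpu, seconds) for every soft lockup message found in log_text."""
    return [(int(cpu), int(secs)) for cpu, secs in SOFT_LOCKUP.findall(log_text)]
```

The input would be the text of the kernel log, for example the output of `dmesg` or `journalctl -k -b` saved after each test run.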