Frequent VM startup failures - R4rc2 #3221
Comments
tasket
Oct 26, 2017
Issue #3125 regards 'libxenlight' errors which I'm not seeing.
Also, this startup problem occurs with regular appVMs as much as network-providing or device-mapped VMs. So if sys-net and sys-firewall are running OK, an appVM that uses sys-firewall (or no netvm at all) may still fail to start.
andrewdavidwong added the bug and C: core labels on Oct 27, 2017
andrewdavidwong added this to the Release 4.0 milestone on Oct 27, 2017
tasket
Oct 27, 2017
There is an emerging pattern (and workaround) to what I'm experiencing:
On boot, sys-net will usually start but sys-firewall or VPN (these both connect to sys-net) will fail, and any appVMs that use these proxyVMs will also fail. Non-connected appVMs may or may not start at this point. If I keep re-trying different VMs, I may get sys-firewall or VPN to run but downstream appVMs can't access the net.
However, if I shut down sys-net along with all the other VMs, I can then start VMs with much more reliability: I can start an appVM, and then sys-net and VPN or sys-firewall will start and run properly.
A memory management issue may be related to this... I have noticed that appVMs sometimes lose the ability to acquire more RAM despite plenty being available, which results in the appVM swapping heavily when demand increases. However, when I restart sys-net as described above, the new VM instances seem to retain their ability to gain (and relinquish) RAM; that is how my system is behaving now.
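The restart order described in this comment could be scripted in dom0 along the following lines. This is only a sketch: it assumes the standard `qvm-shutdown` and `qvm-start` tools are on the path and that `qvm-start` exits non-zero when a VM fails to start. The helper names and the VM name `work` used in the usage note are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence


def run_command(argv: Sequence[str]) -> int:
    """Run a dom0 command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def restart_in_order(
    start_order: Sequence[str],
    run: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Shut down every VM, then start the given VMs one at a time.

    Returns the names of the VMs whose start command reported failure.
    The `run` parameter exists so the logic can be exercised without Qubes.
    """
    # Stop everything first, including sys-net, and wait for completion.
    run(["qvm-shutdown", "--all", "--wait"])

    failed = []
    for vm in start_order:
        # Assumption: a non-zero exit status means the VM did not start.
        if run(["qvm-start", vm]) != 0:
            failed.append(vm)
    return failed
```

Calling `restart_in_order(["work", "sys-net", "sys-firewall"])` would follow the order reported above: an appVM first, then the network-providing VMs. Whether this order helps on other systems is not established by this thread.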
na--
Oct 27, 2017
> A memory management issue may be related to this... I have noticed sometimes appVMs lose the ability to acquire more RAM despite plenty available, resulting in the appVM swapping heavily when demand increases.
I've not experienced any of the other issues you described, but that memory management issue happened to me a few times and I've not managed to reliably replicate it yet. Although, now that I think about it, sys-net usually fails to start at boot, even though it should, while sys-usb starts normally.
pietrushnic
Oct 28, 2017
I also have problems with sys-net. When installing rc2 I chose to have sys-net handle USB, so my PCI USB controller is attached to it. If I detach the USB controller, sys-net starts fine, but then I have no USB access. This is the error I see when trying to start it from dom0:
Start failed: internal error: Unable to reset PCI device 0000:00:14.0: internal error: libxenlight failed to create new domain 'sys-net'
If I attach the USB controller after boot and then start sys-net, I get:
Start failed: internal error: Unable to reset PCI device 0000:00:14.0: no FLR, PM reset or bus reset available
Not sure if this is related. This behavior didn't appear on R4-rc1.
aphidfarmer
Oct 28, 2017
Same here...
- sys-net always starts automatically with no issues
- sys-firewall often cannot start and eventually fails with qrexec error.
Even if sys-firewall does start successfully, I have similar issues with AppVMs (with no pci devices) connected to sys-firewall.
marmarek
Oct 28, 2017
Member
How much memory does the system have? Try adjusting initial memory (increase it), or maxmem (decrease it).
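For readers who want to try this suggestion from a dom0 terminal, the sketch below builds the corresponding `qvm-prefs` invocations for the `memory` (initial) and `maxmem` properties. The helper names are hypothetical, and the values in the usage note are illustrative rather than recommended settings.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def memory_pref_commands(
    vm: str,
    initial_mb: Optional[int] = None,
    maxmem_mb: Optional[int] = None,
) -> List[List[str]]:
    """Build qvm-prefs commands for a VM's initial memory and maxmem (in MB)."""
    if initial_mb is not None and maxmem_mb is not None and initial_mb > maxmem_mb:
        raise ValueError("initial memory should not exceed maxmem")

    commands = []
    if initial_mb is not None:
        commands.append(["qvm-prefs", vm, "memory", str(initial_mb)])
    if maxmem_mb is not None:
        commands.append(["qvm-prefs", vm, "maxmem", str(maxmem_mb)])
    return commands


def apply_commands(
    commands: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> None:
    """Run each command in order; check_call raises if one of them fails."""
    for argv in commands:
        run(argv)
```

For example, `apply_commands(memory_pref_commands("sys-firewall", initial_mb=600, maxmem_mb=1024))` would raise the initial allocation and cap the maximum for that VM; the change applies the next time the VM starts.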
The above comment is to check relation to #2853
aphidfarmer
Oct 29, 2017
sys-firewall: 600MB initial, 1GiB max
appvm that sometimes also fails to start: 400MB initial, 2GiB max; same thing if I change to 1GiB/2 GiB.
physical memory: 8 GiB
As tasket mentioned, the issue seems to affect VMs downstream from sys-firewall. If I change my AppVM's netvm from sys-firewall to none, it starts always. If I change it back, startup fails most of the time.
Ok, so this is something different.
ghost
Oct 29, 2017
Having the same issue, but starting a VM multiple times doesn't help.
tasket
Oct 31, 2017
@marmarek
sys-net, sys-firewall and VPN are limited to 400MB with no balancing. All the other VMs have the default 400/3940MB with balancing. I will try increasing min to 600MB on appVMs.
One trick that has worked over the last 2 days:
A small appVM (300/400 RAM) that is isolated (no netvm) has a good chance of starting and then I can subsequently start other, connected VMs.
tasket
Oct 31, 2017
Also, overall system RAM is 8GB.
A commit in HW42/qubes-vmm-xen-stubdom-linux referenced this issue on Nov 7, 2017
HW42 referenced this issue in QubesOS/qubes-vmm-xen-stubdom-linux on Nov 7, 2017: qemu: Add upstream patches for RCU deadlock #9 (Merged)
qubesos-bot referenced this issue in QubesOS/updates-status on Nov 14, 2017: vmm-xen-stubdom-linux v1.0.5 (r4.0) #295 (Closed)
aphidfarmer
Nov 15, 2017
After qubes-dom0-update, VMs now start consistently for me (whereas before, startup would fail half the time).
tasket
Nov 18, 2017
I'm having similar good luck for the last 24 hrs. since the update, but I'm still keeping my fingers crossed. :)
tasket commented Oct 26, 2017 (edited Oct 31, 2017)
Qubes OS version:
R4rc2
Affected TemplateVMs:
fedora-25
debian-8
Steps to reproduce the behavior:
Start any VM using these templates.
Expected behavior:
VM starts and responds to commands.
Actual behavior:
Desktop notification that VM is starting, but there is relatively little disk activity and the VM menu widget shows a busy indicator for that VM until a minute later when the VM disappears.
This happens close to 50% of the time.
General notes:
Trying to start the VM over and over can get the VM running.
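The retry approach mentioned in the general notes could be automated with a small loop like the one below. It is a sketch under assumptions: it relies on `qvm-start` returning a non-zero exit status when startup fails, and the function name, attempt count, and delay are hypothetical choices, not values taken from this report.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def start_with_retries(
    vm: str,
    attempts: int = 5,
    delay_s: float = 10.0,
    run: Optional[Callable[[Sequence[str]], int]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Try to start a VM up to `attempts` times.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    if run is None:
        # Default runner: execute the command and report its exit status.
        run = lambda argv: subprocess.run(list(argv)).returncode

    for attempt in range(1, attempts + 1):
        if run(["qvm-start", vm]) == 0:
            return attempt
        if attempt < attempts:
            # Pause before the next try; no pause after the final failure.
            sleep(delay_s)
    return None
```

A call such as `start_with_retries("work")` (with `work` standing in for any VM name) returns which attempt succeeded, which could also help quantify how often startup fails on a given system.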
Discussion thread here:
https://groups.google.com/d/msgid/qubes-users/g5tLp_yA2-jvKvSkZmQyCEJU50NS6aWb7m1Dmezb6d1y2loGsi-fh1pSgK5Jk2ovnwECfmVVAym11iFX7CbaAdGvX_iKZWOvKzzAF4eEcsE%3D%40protonmail.com
Related issues: