Stubdomain related bugs/todos #2849
Comments
marmarek
added
bug
C: xen
P: major
labels
Jun 7, 2017
marmarek
added this to the Release 4.0 milestone
Jun 7, 2017
added a commit
to marmarek/qubes-vmm-xen-stubdom-linux
that referenced
this issue
Jun 13, 2017
added a commit
to marmarek/qubes-vmm-xen-stubdom-linux
that referenced
this issue
Jun 19, 2017
added a commit
to HW42/qubes-vmm-xen
that referenced
this issue
Jul 1, 2017
added a commit
to HW42/qubes-vmm-xen
that referenced
this issue
Jul 1, 2017
added a commit
to HW42/qubes-vmm-xen
that referenced
this issue
Jul 1, 2017
HW42
referenced this issue
in QubesOS/qubes-vmm-xen
Jul 1, 2017
Merged
linux-stubdom improvements #9
HW42
Jul 1, 2017
Bugs:
- dynamic network attach/detach fails (libxl complains about qemu in dom0 not running)
Fixed by QubesOS/qubes-vmm-xen@c747117 (see QubesOS/qubes-vmm-xen#9). The fix simply skips the check for dom0 qemu. As long as the domain has PV drivers, hotplugging still works. This is the same behavior as in R3.2.
- domain suspend does not suspend/pause stubdomain; domain pause does
Fixed by QubesOS/qubes-vmm-xen@441411d (see QubesOS/qubes-vmm-xen#9).
- PCI hotplugging
See https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg02012.html and https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg02013.html
Missing features (optional):
- OVMF (EFI) support - #2577 (comment)
Partially fixed by QubesOS/qubes-vmm-xen@e1a7d1b (see QubesOS/qubes-vmm-xen#9). Remaining problems:
- It seems Xen currently doesn't support persistent NVRAM variables. This is going to cause problems with classic OSs (for our templates this doesn't matter). Probably the best thing is to simply provide a toggle to switch back to traditional BIOS.
- OVMF triggers a bug in qemu gui-agent. I will add the EFI option to qubes-admin as soon as this is fixed.
marmarek
Jul 1, 2017
Member
with classic OSs (for our templates this doesn't matter). Probably the best thing is to simply provide a toggle to switch back to traditional BIOS.
OVMF seems to create NVRAM file on ESP. So I think it isn't an issue at all.
On the other hand, I've failed to set it up to boot ESP/EFI/BOOT/BOOTX64.efi automatically. I always needed to launch it manually from the EFI shell.
BTW I have a "64bit-qemu" branch on my account. Not sure if really needed, but I've got some issue about inability to run 64bit code on 32bit CPU...
OVMF triggers a bug in qemu gui-agent. I will add the EFI option to qubes-admin as soon as this is fixed.
What bug? I think I had it working in my very hacky setup...
Anyway, any idea about that USB controller issue? If you can't reproduce it on your hardware, I can give you SSH access to the machine having this issue.
qubesos-bot
referenced this issue
in QubesOS/updates-status
Jul 2, 2017
Closed
vmm-xen v4.8.1-3 (r4.0) #94
marmarek
Jul 4, 2017
Member
Added new issue:
- dynamic block attach/detach fails (libxl complains about qemu in dom0 not running)
I wonder what would happen if we silence this at libxl__dm_check_start, or libxl__need_xenpv_qemu level, instead of each call-site of those? Quick grep doesn't reveal anything alarming. @HW42 ?
qubesos-bot
referenced this issue
in QubesOS/updates-status
Jul 5, 2017
Closed
vmm-xen-stubdom-linux v1.0.1 (r4.0) #108
added a commit
to HW42/qubes-core-admin
that referenced
this issue
Jul 5, 2017
added a commit
to HW42/qubes-vmm-xen
that referenced
this issue
Jul 5, 2017
HW42
referenced this issue
in QubesOS/qubes-vmm-xen
Jul 5, 2017
Merged
stubdom-linux, libxl: disable check for dom0 qemu #10
HW42
Jul 5, 2017
with classic OSs (for our templates this doesn't matter). Probably the best thing is to simply provide a toggle to switch back to traditional BIOS.
OVMF seems to create NVRAM file on ESP. So I think it isn't an issue at all.
If I edit the boot order with efibootmgr, the change is not preserved. Also, the Debian stretch installer didn't show any error, but the boot after the installation did not work. Running grub-install --removable /dev/xvda resolved the issue. So I think it is a problem for OSs that are not Qubes-aware. It's also not clear how OVMF should choose which ESP to read the NVRAM from before deciding what to boot.
On the other hand, I've failed to set it up to boot ESP/EFI/BOOT/BOOTX64.efi automatically. I always needed to launch it manually from the EFI shell.
Did you try to install the bootloader in "removable" mode?
BTW I have a "64bit-qemu" branch on my account. Not sure if really needed, but I've got some issue about inability to run 64bit code on 32bit CPU...
Ah ok, this part comes directly from Eric's patch. I haven't experienced any problems so far.
New TODO:
- check if bootdelay in OVMF can be reduced
OVMF triggers a bug in qemu gui-agent. I will add the EFI option to qubes-admin as soon as this is fixed.
What bug? I think I had it working in my very hacky setup...
OVMF enables a video mode in which qemu uses the video ram of the target domain directly as frame buffer (mapped via xc_map_foreign_bulk). For this mapping u2mfn does not work without the fix from QubesOS/qubes-linux-utils#15.
- dynamic block attach/detach fails (libxl complains about qemu in dom0 not running)
I wonder what would happen if we silence this at libxl__dm_check_start, or libxl__need_xenpv_qemu level, instead of each call-site of those? Quick grep doesn't reveal anything alarming. @HW42 ?
For libxl__dm_check_start this looks fine, see QubesOS/qubes-vmm-xen#10. libxl__need_xenpv_qemu is only checked at startup, so if it says true a workaround is probably more complex than silencing the warning.
qubesos-bot
referenced this issue
in QubesOS/updates-status
Jul 6, 2017
Closed
vmm-xen v4.8.1-4 (r4.0) #127
marmarek
Jul 7, 2017
Member
Working OVMF is a nice thing to have, but currently fixing PCI passthrough is a much higher priority. Broken MSI support looks like a common pattern; from another machine:
[ 1.862759] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 1.862759] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 1.863099] xen: --> pirq=16 -> irq=36 (gsi=36)
[ 1.864743] e1000e 0000:00:05.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 1.864745] e1000e 0000:00:05.0 0000:00:05.0 (uninitialized): Failed to initialize MSI interrupts. Falling back to legacy interrupts.
[38413.296732] Intel(R) Wireless WiFi driver for Linux
[38413.296767] Copyright(c) 2003- 2015 Intel Corporation
[38413.298211] xen: --> pirq=16 -> irq=40 (gsi=40)
[38413.301052] iwlwifi 0000:00:06.0: pci_enable_msi failed - -22
[38413.301383] xen:events: Failed to obtain physical IRQ 40
[38413.303009] iwlwifi 0000:00:06.0: Direct firmware load for iwlwifi-8000C-28.ucode failed with error -2
[38413.309344] iwlwifi 0000:00:06.0: capa flags index 3 larger than supported by driver
[38413.309908] iwlwifi 0000:00:06.0: loaded firmware version 27.455470.0 op_mode iwlmvm
[38413.362149] iwlwifi 0000:00:06.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208
[38413.365900] iwlwifi 0000:00:06.0: L1 Disabled - LTR Disabled
[38413.367897] iwlwifi 0000:00:06.0: L1 Disabled - LTR Disabled
[38418.581641] iwlwifi 0000:00:06.0: Failed to load firmware chunk!
[38418.581772] iwlwifi 0000:00:06.0: Could not load the [0] uCode section
[38418.581913] iwlwifi 0000:00:06.0: Failed to start INIT ucode: -110
[38418.587150] iwlwifi 0000:00:06.0: Failed to run INIT ucode: -110
[38418.589083] iwlwifi 0000:00:06.0: L1 Disabled - LTR Disabled
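To check other machines for the same pattern, a minimal sketch like the one below could scan kernel log text for the two MSI failure messages that appear in the log above and group them by PCI address. The function name `find_msi_failures` and the regular expression are illustrative assumptions, not part of any Qubes or Xen tool, and only these two message strings are recognized.

```python
import re
from collections import defaultdict

# Matches "<driver> <pci address> ... <message>" after the kernel timestamp.
# Only the two failure messages seen in the log above are recognized
# (assumption: other drivers may word their MSI failures differently).
MSI_FAILURE = re.compile(
    r"\]\s+(?P<driver>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\b.*?"
    r"(?P<msg>Failed to initialize MSI interrupts|pci_enable_msi failed)"
)


def find_msi_failures(log_text):
    """Return {pci_address: [(driver, message), ...]} for MSI failure lines."""
    failures = defaultdict(list)
    for line in log_text.splitlines():
        match = MSI_FAILURE.search(line)
        if match:
            failures[match["bdf"]].append((match["driver"], match["msg"]))
    return dict(failures)


# Three lines copied from the log above; the last one has no PCI address
# and is therefore ignored by the pattern.
SAMPLE = """\
[ 1.864745] e1000e 0000:00:05.0 0000:00:05.0 (uninitialized): Failed to initialize MSI interrupts. Falling back to legacy interrupts.
[38413.301052] iwlwifi 0000:00:06.0: pci_enable_msi failed - -22
[38413.301383] xen:events: Failed to obtain physical IRQ 40
"""

if __name__ == "__main__":
    for bdf, entries in find_msi_failures(SAMPLE).items():
        for driver, msg in entries:
            print(f"{bdf} ({driver}): {msg}")
```

Feeding it the output of dmesg from an HVM with passed-through devices would list each device whose driver could not enable MSI.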
added a commit
to marmarek/qubes-mgmt-salt-dom0-virtual-machines
that referenced
this issue
Jul 18, 2017
qubesos-bot
referenced this issue
in QubesOS/updates-status
Jul 18, 2017
Closed
mgmt-salt-dom0-virtual-machines v4.0.2 (r4.0) #138
added a commit
to HW42/qubes-vmm-xen-stubdom-linux
that referenced
this issue
Jul 24, 2017
marmarek
Jul 28, 2017
Member
I noticed stubdomains use about 10% CPU constantly. Haven't investigated yet.
HW42
Jul 28, 2017
I noticed stubdomains use about 10% CPU constantly. Haven't investigated yet.
Seems to be not directly stubdom related but a problem with qemu. If I run qemu in dom0, the CPU usage of dom0 (according to xl top) rises by about 10%. top in dom0 reports roughly 8% for the qemu process.
marmarek
Jul 28, 2017
Member
Seems to be not directly stubdom related but a problem with qemu. If I run qemu in dom0, the CPU usage of dom0 (according to xl top) rises by about 10%. top in dom0 reports roughly 8% for the qemu process.
Probably worth investigating, later.
Another thing I noticed: suspend of an HVM with PCI devices fails. It never really suspends (though there are some related messages from the VM kernel), and finally libxl times out. For an HVM without PCI devices it works.
Maybe related to other PCI problems?
BTW what do you think about having a shell in the stubdomain by default (instead of the sleep loop)? Debugging would be easier. Do you see any downsides?
marmarek
Aug 14, 2017
Member
One more issue: #2951 (it makes it almost impossible to install win7)
Looks like libxl logic issue regarding qemu-upstream/qemu-traditional features.
added a commit
to HW42/qubes-core-admin
that referenced
this issue
Sep 14, 2017
added a commit
to HW42/qubes-core-admin
that referenced
this issue
Sep 14, 2017
added a commit
to HW42/qubes-core-admin
that referenced
this issue
Sep 15, 2017
marmarek
Sep 22, 2017
Member
Hmm, I see we have xenstore-read available in stubdom. Maybe it would be a good idea to fix qvm-prefs kernelopts for HVM (instead of the cmdline.txt file)? Currently kernelopts for HVM are not written to xenstore, but it should be doable. @HW42 what do you think?
HW42
Sep 27, 2017
Hmm, I see we have xenstore-read available in stubdom. Maybe it would be a good idea to fix qvm-prefs kernelopts for HVM (instead of the cmdline.txt file)? Currently kernelopts for HVM are not written to xenstore, but it should be doable. @HW42 what do you think?
Turned out to be a little bit trickier than I first thought. The problem is that the stubdom is (intentionally) started directly when you create a domain (even with VIR_DOMAIN_START_PAUSED), and before the domain is created the xenstore path doesn't exist yet. So I had to patch Xen. See QubesOS/qubes-vmm-xen#17, QubesOS/qubes-vmm-xen-stubdom-linux#6 and QubesOS/qubes-core-admin#151.
I dropped the unused cmdline.txt support. Do you see a use for it? If yes, what should be the behavior (overwriting, merging, ...)?
qubesos-bot
referenced this issue
in QubesOS/updates-status
Sep 27, 2017
Closed
vmm-xen-stubdom-linux v1.0.2 (r4.0) #239
marmarek
Oct 2, 2017
Member
Just tested sys-usb with the MSI patches applied. It fixes sys-usb as HVM generally, also on other machines. But sys-usb still hangs after suspend:
[ 77.992077] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 77.993575] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 77.997918] PM: freeze of devices complete after 3.027 msecs
[ 77.998408] PM: late freeze of devices complete after 0.464 msecs
[ 78.017709] PM: noirq freeze of devices complete after 19.275 msecs
[ 78.018020] xen:events: Xen HVM callback vector for event delivery is enabled
[ 78.018020] xen:grant_table: Grant tables using version 1 layout
[ 78.018020] Suspended for 249.976 seconds
[ 78.021274] PM: noirq thaw of devices complete after 3.450 msecs
[ 78.021452] PM: early thaw of devices complete after 0.126 msecs
[ 78.021606] rtc_cmos 00:02: System wakeup disabled by ACPI
[ 78.067039] xhci_hcd 0000:00:05.0: port 8 resume PLC timeout
[ 78.067039] xhci_hcd 0000:00:05.0: port 6 resume PLC timeout
[ 78.067039] xhci_hcd 0000:00:05.0: port 5 resume PLC timeout
And at this point sys-usb console is not responsive (nor any other application there).
marmarek
Oct 5, 2017
Member
I dropped the unused cmdline.txt support. Do you see a use for it? If yes, what should be the behavior (overwriting, merging, ...)?
Yes, let's drop it. It was a workaround for missing kernelopts support.
qubesos-bot
referenced this issue
in QubesOS/updates-status
Oct 7, 2017
Closed
core-admin v4.0.8 (r4.0) #250
added a commit
to marmarek/qubes-mgmt-salt-dom0-virtual-machines
that referenced
this issue
Oct 8, 2017
qubesos-bot
referenced this issue
in QubesOS/updates-status
Oct 8, 2017
Closed
mgmt-salt-dom0-virtual-machines v4.0.6 (r4.0) #252
andrewdavidwong
modified the milestones:
Release 4.0,
Release 4.0 updates
Mar 31, 2018
dylangerdaly
Jul 7, 2018
Hey guys,
This still seems to be an issue: every running stubdomain takes about 10% of my CPU. It's possible to switch back to Xen's Mini-OS, but then sys-net fails to load.
It seems to be an issue with qemu in the Linux stub. Any idea how to track down the problem further, or how to revert to the Mini-OS stub?
marmarek commented Jun 7, 2017 (edited 2 times: Sep 6, 2017 and Nov 20, 2017, most recent)
Qubes OS version (e.g., R3.2): R4.0
A single ticket for tracking Linux stubdomain-related bugs and missing features. The list below is intended to be updated, but when you do so, also add a comment to trigger a notification email.
Bugs:
Missing features (optional):
/cc @HW42