[Qubes 3.0] net-vm refuses to boot - Failed to restore PCI config space #1525

Closed
canihavesomecoffee opened this Issue Dec 18, 2015 · 8 comments


This issue is similar to my previous one (#1524) and probably has the same root cause. This time, however, I managed to find out more.

Same laptop; this time I installed the stable 3.0 version (using Qubes-R3.0-x86_64-DVD). All went fine until I created the default VMs. A Python error popped up (I couldn't copy it, however) in a box so long that the "OK" button was off-screen. Next I noticed that the net-vm didn't start. Upon investigation, the cause seems to be the PCI device for my Ethernet card (which didn't have a cable plugged in at install time).

This is the result when I try to run the VM manually using qvm-start sys-net:

--> Creating volatile image: /var/lib/qubes/servicevms/sys-net/volatile.img...
--> Loading the VM (type = NetVM)...
Traceback (most recent call last):
  File "/usr/bin/qvm-start", line 125, in <module>
    main()
  File "/usr/bin/qvm-start", line 109, in main
    xid = vm.start(verbose=options.verbose, preparing_dvm=options.preparing_dvm, start_guid=not options.noguid, notify_function=tray_notify_generic if options.tray else None)
  File "/usr/lib64/python2.7/site-packages/qubes/modules/005QubesNetVm.py", line 121, in start
    xid=super(QubesNetVm, self).start(**kwargs)
  File "/usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py", line 1773, in start
    self.libvirt_domain.createWithFlags(libvirt.VIR_DOMAIN_START_PAUSED)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1037, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirt.libvirtError: internal error: Unable to reset PCI device 0000:08:00.0: internal error: Failed to restore PCI config space for 0000:08:00.0

Relevant output from lspci:

...
03:00.0 Network controller: Qualcomm Atheros AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
...
08:00.0 Ethernet controller: Qualcomm Atheros AR8131 Gigabit Ethernet (rev c0)
...

The net-vm only boots when I remove the Ethernet controller from the list of attached devices. I've tried turning it on with a working Ethernet cable plugged in, but to no avail.

Any help in fixing this would be appreciated, as I use the Ethernet cable from time to time. For now I can experiment further using wireless, though.

Possibly related: #1364 and maybe #1393 as well. More information can be provided if needed.

@canihavesomecoffee canihavesomecoffee changed the title from [Qubes 3.0] to [Qubes 3.0] net-vm refuses to boot - Failed to restore PCI config space Dec 18, 2015

marmarek Dec 23, 2015

Member

As a workaround, you can set the pci_strictreset VM property to false (using the qvm-prefs tool). But generally the above means that the device most likely doesn't support the features (reset, in this case) required to safely assign it to the VM.

You can check /var/log/libvirt/libxl/sys-net.log for some more details.
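
One way to see whether the kernel knows any reset method for the device is to look for a `reset` file in its sysfs directory. The following is a minimal Python sketch, not part of Qubes: the function name `has_reset_method` and the `sysfs_root` parameter are made up for illustration, and the address is the one from the error message above. A present `reset` file only means the kernel exposes some reset method; it does not guarantee that libvirt can reset the device and restore its config space.

```python
import os


def has_reset_method(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return True if sysfs exposes a 'reset' file for the PCI device.

    `bdf` is the full PCI address, e.g. "0000:08:00.0".
    `sysfs_root` is a parameter only so the check can be tried on a fake tree.
    """
    return os.path.exists(os.path.join(sysfs_root, bdf, "reset"))


if __name__ == "__main__":
    # Address taken from the libvirt error in this report.
    address = "0000:08:00.0"
    if has_reset_method(address):
        print("kernel exposes a reset method for " + address)
    else:
        print("no reset method exposed for " + address)
```

If the file is missing, that would be consistent with the explanation above that the device lacks the reset support needed for safe assignment.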

canihavesomecoffee Dec 27, 2015

Hello,

thanks for the reply (happy holidays!). I've done a little bit of experimentation just now (writing this using Qubes), and I've encountered the following:

  • After setting pci_strictreset to false, the netvm still refused to boot and even triggered a critical error that required a restart. The error that showed up first was:
----
line: if ret == -1: raise libvirtError ('virDomainIsActive() failed', dom=self)
func: isActive
line no.: 1268
file: /usr/lib64/python2.7/site-packages/libvirt.py
----
line: if self.libvirt_domain.isActive():
func: is_running
line no.: 892
file: /usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py
----
line: if not vm.is_running():
func: block_list_vm
line no.: 241
file: /usr/lib64/python2.7/site-packages/qubes/qubesutils.py
----
line: devices_list.update(block_list_vm(vm, system_disks))
func: block_list
line no.: 324
file: /usr/lib64/python2.7/site-packages/qubes/qubesutils.py
----
line: blk = qubesutils.block_list(self.qvm_collection)
func: update
line no.: 81
file: /usr/lib64/python2.7/site-packages/qubesmanager/block.py
----
line: self.update()
func: check_for_updates
line no.: 67
file: /usr/lib64/python2.7/site-packages/qubesmanager/block.py
----
line: res, msg = self.blk_manager.check_for_updates()
func: update_block_devices
line no.: 820
file: /usr/lib64/python2.7/site-packages/qubesmanager/main.py
----
line: update_devs = self.update_block_devices() or out_of_schedule
func: update_table
line no.: 692
file: /usr/lib64/python2.7/site-packages/qubesmanager/main.py

After closing that error, every action on the net vm in the Qubes VM Manager resulted in this error:

----
line: if ret == -1: raise libvirtError ('virDomainIsActive() failed', dom=self)
func: isActive
line no.: 1268
file: /usr/lib64/python2.7/site-packages/libvirt.py
----
line: if self.libvirt_domain.isActive():
func: is_running
line no.: 892
file: /usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py
----
line: running = vm.is_running()
func: open_context_menu
line no.: 1601
file: /usr/lib64/python2.7/site-packages/qubesmanager/main.py

However, after a reboot (with the cable plugged in), the net vm now is booting (even with the pci_strictreset back to true), but as soon as a reboot of the net vm is attempted it goes wrong again.

Concerning the libxl log file: I went through it, but can't make much of it... These lines seem to be the most interesting ones:

libxl: error: libxl_device.c:1235:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/1/0 not ready
libxl: debug: libxl_pci.c:174:libxl__device_pci_remove_xenstore: pci backend at /local/domain/0/backend/pci/1/0 is not ready
libxl: error: libxl_pci.c:1244:do_pci_remove: xc_physdev_unmap_pirq irq=17: Invalid argument
libxl: error: libxl_device.c:1235:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/1/0 not ready
libxl: debug: libxl_pci.c:174:libxl__device_pci_remove_xenstore: pci backend at /local/domain/0/backend/pci/1/0 is not ready

The error with the xc_physdev_unmap_pirq is only listed twice in the log file, both at the bottom. I assume these were from when I tried out the pci_strictreset setting. Otherwise the error is exactly the same (minus that single line of course).
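
To make a log like this easier to skim, the error lines can be counted per libxl function. This is a rough Python sketch under the assumption that every relevant line follows the `libxl: <level>: <file>:<line>:<function>: <message>` layout seen in the excerpt above; the helper names `parse_libxl_line` and `count_errors_by_function` are invented for this example.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "libxl: level: file:line:function: message"
LIBXL_LINE = re.compile(
    r"^libxl: (?P<level>\w+): (?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+): (?P<msg>.*)$"
)


def parse_libxl_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LIBXL_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_errors_by_function(lines):
    """Count lines with level 'error', keyed by the libxl function that logged them."""
    counts = Counter()
    for line in lines:
        entry = parse_libxl_line(line)
        if entry and entry["level"] == "error":
            counts[entry["func"]] += 1
    return counts


# The five lines quoted in this comment.
SAMPLE = """\
libxl: error: libxl_device.c:1235:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/1/0 not ready
libxl: debug: libxl_pci.c:174:libxl__device_pci_remove_xenstore: pci backend at /local/domain/0/backend/pci/1/0 is not ready
libxl: error: libxl_pci.c:1244:do_pci_remove: xc_physdev_unmap_pirq irq=17: Invalid argument
libxl: error: libxl_device.c:1235:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/1/0 not ready
libxl: debug: libxl_pci.c:174:libxl__device_pci_remove_xenstore: pci backend at /local/domain/0/backend/pci/1/0 is not ready
"""

if __name__ == "__main__":
    for func, count in count_errors_by_function(SAMPLE.splitlines()).items():
        print(func + ": " + str(count))
```

On a real system the lines would come from /var/log/libvirt/libxl/sys-net.log instead of the embedded sample.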

If you need more information, or if you'd like me to try other things out, please let me know. I'd really love to help as much as possible 👍

marmarek Dec 27, 2015

Member

On Sun, Dec 27, 2015 at 12:12:55PM -0800, Willem wrote:

> line: if ret == -1: raise libvirtError ('virDomainIsActive() failed', dom=self)

Did you get anything else there, above that message? There should be
actual exception text... I guess it's something about a broken
connection to libvirtd.

> However, after a reboot (with the cable plugged in), the net vm now is booting (even with the pci_strictreset back to true), but as soon as a reboot of the net vm is attempted it goes wrong again.

How are you rebooting the netvm? If you are shutting it down from within,
that isn't supported yet (#1426).

It may work when you use qvm-shutdown --force (and sometimes you need to
set netvm back to the same value, e.g. qvm-prefs -s sys-firewall netvm sys-net),
but generally it isn't fully supported. The proper way
is to shut down all the connected VMs first...

Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

canihavesomecoffee Dec 29, 2015

I reboot the netvm by shutting down all "upper" (work, personal, ...) VMs, then the firewall-vm and then the netvm through the Qubes VM Manager, followed by "Start/Resume VM" once the VM Manager indicates that it's shut down.
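
The shutdown order described here (clients first, then the firewall VM, then the netvm) can be derived from a map of which VM uses which netvm. Below is a small illustrative Python sketch; the `NETVM_OF` table is hypothetical example data using the VM names mentioned in this thread, and `shutdown_order` is not a Qubes function. It assumes the netvm chain has no cycles and ignores whether a VM is actually running.

```python
def shutdown_order(netvm_of, target):
    """List the VMs to shut down, dependents first, ending with `target`.

    `netvm_of` maps each VM name to the name of its netvm (or None).
    """
    order = []

    def visit(vm):
        # Every VM that uses `vm` as its netvm must go down before `vm` itself.
        clients = sorted(name for name, netvm in netvm_of.items() if netvm == vm)
        for client in clients:
            visit(client)
        order.append(vm)

    visit(target)
    return order


# Hypothetical layout matching the VMs named in this thread.
NETVM_OF = {
    "work": "sys-firewall",
    "personal": "sys-firewall",
    "sys-firewall": "sys-net",
    "sys-net": None,
}

if __name__ == "__main__":
    print(shutdown_order(NETVM_OF, "sys-net"))
```

Each name in the resulting list could then be shut down in turn (for example with qvm-shutdown) before the netvm is started again.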

Regarding the actual error message: I can't remember it exactly, but I'll update this post once I've triggered the issue again.

marmarek Dec 29, 2015

Member

On Tue, Dec 29, 2015 at 08:05:52AM -0800, Willem wrote:

> Rebooting the netvm is done by shutting down all "upper" (work, personal, ...) vm's, then the firewall-vm and then the netvm through the Qubes VM Manager, followed by the "Start/Resume VM" once the VM Manager indicates that it's shut down.
>
> Regarding the actual error message: I can't remember it exactly, but I'll update this post once I triggered the issue again.

Good news: I think I know what is wrong. Generally take a look at this
discussion:
#1038 (comment)
#1038 (comment)

Bad news: we don't have a solution for that, other than "do not restart
netvm".


canihavesomecoffee Dec 29, 2015

Ok, thanks for that information, I'll read through it.

Concerning the other error you were asking about, I've done the same again (restarting the netvm with pci_strictreset set to false) and noted the actual error this time:

libvirtError: internal error: client socket is closed
at line 1268
of file libvirt.py.

This error message is the same for both stack traces.

marmarek Dec 30, 2015

Member

The above error looks like #990.
Try restarting qubes-manager (to really close it, choose the exit option from its tray icon's context menu).
