VM does not start - Qubes R4 - qrexec and sudo xl console vmname - broken at the same time in any Debian based VMs #3187

Closed
adrelanos opened this issue Oct 19, 2017 · 17 comments
Labels
C: core C: Debian/Ubuntu C: Xen T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

@adrelanos
Member

Qubes OS version:

R4 RC1 with all upgrades from Qubes testing

Affected TemplateVMs:

debian-stretch

(Cloned debian-8 template, upgraded to stretch and Qubes testing.)

Steps to reproduce the behavior:

Unclear. It happens after the system has been running for a while: at first everything works, then no more VMs can be started.

Speculation: Might happen when starting too many VMs too quickly. Multiple starts/shutdowns of VMs too quickly.

Could a long-running qrexec operation, such as copying new packages downloaded with qubes-dom0-update from the UpdateVM to dom0, block other qrexec operations (such as starting VMs)?

Expected behavior:

Qrexec works (VMs can be started) and sudo xl console vmname works.

Actual behavior:

Even sudo xl console vmname does not work. Shows:

Could not read tty from store: No such file or directory

Or, after running qvm-shutdown vm-name followed by qvm-start vm-name:

vm-name is an invalid domain identifier (rc=-6)

qvm-start vmname does nothing: it outputs nothing and exits with return code 0.

General Notes:

What debug output would be useful?

@0spinboson

0spinboson commented Oct 19, 2017

(just repeating marek's suggestion made to me elsewhere: have you also tried sudo xl console -t pv vmname ?)
no idea about the broader issue, though. no issues in dom0 journalctl -xb?

@marmarek
Member

Also see /var/log/xen/console/guest-debian-stretch.log - maybe some crash during startup?

@andrewdavidwong andrewdavidwong added T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. C: core C: Debian/Ubuntu C: Xen labels Oct 19, 2017
@andrewdavidwong andrewdavidwong added this to the Release 4.0 milestone Oct 19, 2017
@adrelanos
Member Author

I failed to reproduce this issue for some hours. Now I managed to reproduce. It's not a debian-stretch issue since it also happened with debian-8 and whonix-ws TemplateBased AppVMs.

To reproduce it, I ran qvm-run --all "sudo poweroff" followed by qvm-start browsing. The latter command starts a chain of VMs, i.e. sys-net, sys-firewall and finally browsing. That triggered the bug.

  • VM browsing didn't start.
  • VM personal did start.

(Starting VM personal after that just worked normally.)


qvm-start browsing

It takes a long time, then shows:

Cannot execute qrexec-daemon!

Before that error appears, qvm-ls shows the VM's status as transient.


  • sudo xl console -t pv browsing runs, shows nothing, and then exits with return code 0 back to the shell.
  • sudo xl console -t pv personal just runs as expected.

  • /var/log/xen/console/guest-browsing.log is empty.
  • /var/log/xen/console/guest-personal.log is non-empty.

  • /var/log/xen/console/guest-browsing-dm.log is non-empty.
  • /var/log/xen/console/guest-personal-dm.log is non-empty.
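
The empty/non-empty comparison above can be repeated for any VM with a small helper. This is only a sketch: the function name check_console_logs and the LOG_DIR override are illustrative assumptions, and only the log directory and file naming come from this thread.

```shell
#!/bin/sh
# Report whether each Xen console log for a VM is empty, non-empty, or missing.
# LOG_DIR defaults to the directory named in this thread; it can be overridden.
LOG_DIR="${LOG_DIR:-/var/log/xen/console}"

check_console_logs() {
    vm="$1"
    # "" is the guest log, "-dm" is the stubdomain log.
    for suffix in "" "-dm"; do
        log="$LOG_DIR/guest-${vm}${suffix}.log"
        if [ -s "$log" ]; then
            echo "$log: non-empty"
        elif [ -e "$log" ]; then
            echo "$log: empty"
        else
            echo "$log: missing"
        fi
    done
}

# Example (reading these logs in dom0 may require root):
#   check_console_logs browsing
#   check_console_logs personal
```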

Managed to grab a journal while a VM was not starting.

Btw what is the difference between guest-vmname.log and guest-vmname-dm.log?

I sent the logs by private e-mail to @marmarek.

Can you make head or tail of it? Anything else I can do to help?

@adrelanos adrelanos changed the title qrexec and sudo xl console vmname - broken at the same time in debian-stretch template qrexec and sudo xl console vmname - broken at the same time in any Debian based VMs Oct 20, 2017
@marmarek
Member

@HW42 any idea? I've seen something similar previously. I think it was while debugging grub in HVM. When the Linux kernel was started quickly, the VM sometimes crashed or hung before producing any output from the Linux kernel. But when I added a timeout in grub, it happened much more rarely, or not at all. It still happens from time to time on my system, but since no output is produced in such a case, it is hard to debug. I mostly see it while running automated tests (hundreds of VM starts; it happens in roughly 1% of cases).

Btw what is the difference between guest-vmname.log and guest-vmname-dm.log?

The -dm one is the log of the stubdomain that hosts the qemu process.

@adrelanos
Member Author

As a temporary workaround (attempt)...

How do I increase grub timeout?

Inside the VM, I set the timeout to 30 (just something high to make sure it works) in /etc/default/grub.d/30-qubes.cfg, ran sudo update-grub, and saw that the timeout change took effect in /boot/grub/grub.cfg. Yet I never managed to see grub on sudo xl console.

@marmarek
Member

marmarek commented Oct 25, 2017 via email

@adrelanos
Member Author

When I set it to None in the GUI, it is always reset to the previous value. Do we have a bug report for this? (I just upgraded to the latest packages from Qubes testing and rebooted.)

Did...

qvm-prefs --set vmname kernel ""

VM startup took a long time, but I couldn't see the grub boot menu in xl console. Any advice?

@marmarek
Member

marmarek commented Oct 25, 2017 via email

@HW42

HW42 commented Oct 26, 2017

@HW42 any idea?

Not really. But I think I have seen this at least once. For some reason the VM won't start at all (but the stubdomain runs). I will try to reproduce it.

In case somebody sees this again it would be good to post:

  • Output of xl list ${vm}{,-dm}
  • /var/log/xen/console/guest-${vm}{,-dm}.log since qvm-start
  • /var/log/xen/console/hypervisor.log since qvm-start with loglvl=debug guest_loglvl=debug
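
The items in the list above could be gathered in one go with something like the following. It is a rough sketch, not an official Qubes tool: the function names debug_log_paths and collect_vm_debug, the LOG_DIR override and the output directory layout are all assumptions made for illustration. It copies whole log files, so cutting them down to the part since qvm-start is still a manual step.

```shell
#!/bin/sh
# Print the log files requested above for one VM, one path per line.
debug_log_paths() {
    vm="$1"
    dir="${LOG_DIR:-/var/log/xen/console}"
    printf '%s\n' \
        "$dir/guest-$vm.log" \
        "$dir/guest-$vm-dm.log" \
        "$dir/hypervisor.log"
}

# Save `xl list` output and copies of the logs into an output directory.
# Intended to be run as root in dom0 right after a failed qvm-start.
collect_vm_debug() {
    vm="$1"
    out="${2:-./debug-$1}"
    mkdir -p "$out" || return 1
    # Same as `xl list ${vm}{,-dm}`, written without brace expansion.
    xl list "$vm" "$vm-dm" > "$out/xl-list.txt" 2>&1
    debug_log_paths "$vm" | while IFS= read -r f; do
        # Skip logs that do not exist or cannot be read.
        if [ -r "$f" ]; then
            cp "$f" "$out/"
        fi
    done
    echo "collected into $out"
}

# Example:
#   collect_vm_debug browsing /tmp/debug-browsing
```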

@adrelanos
Copy link
Member Author

Logs sent to @marmarek and @HW42. Let me know if these are the correct logs, and if anything else would be useful to debug this.

@adrelanos
Member Author

Once a VM (debian-stretch) has refused to start, it consistently refuses to start again. Meanwhile, starting different VMs might work (debian-stretch-test).

After enabling debug mode to access it on the VGA console, I could see the grub boot screen. I noticed that /var/log/xen/console/guest-debian-stretch.log only starts after the grub boot menu, but that might be normal.

After the line [ 6.202750] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2 in both the VGA console and guest-debian-stretch.log, about 15 lines of Begin: Running /scripts/local-block ... done. are shown in the VGA console only, not in guest-debian-stretch.log. Then the VGA console window just closes (and qvm-ls shows the VM as stopped).

When I tried this another time, the same as above happened, but I could see a bit more.

Giving up waiting for suspend/resume device
done.
done.
Giving up waiting for root file system device. Common problems:
...
...
BusyBox...

(initramfs)

and automatically terminated.

@HW42

HW42 commented Oct 26, 2017

@adrelanos:

Logs sent to @marmarek and @HW42. Let me know if these are the correct logs, and if anything else would be useful to debug this.

Thanks. loglvl=debug guest_loglvl=debug aren't options for qvm-start but Xen cmdline options (add them in /etc/default/grub to GRUB_CMDLINE_XEN_DEFAULT and reboot). But no need to redo this as long as you see the root fs error (see below).
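
For reference, the change described above might look like the fragment below. It is a sketch that assumes a dom0 booting through GRUB, where /etc/default/grub is read; how the GRUB config is then regenerated (for example with grub2-mkconfig) and where it is written depend on the installation and are not specified in this thread.

```shell
# /etc/default/grub in dom0 (fragment)
# Append the Xen debug log levels to whatever options are already set.
# The ":-" form only keeps this line valid if the variable is not set yet.
GRUB_CMDLINE_XEN_DEFAULT="${GRUB_CMDLINE_XEN_DEFAULT:-} loglvl=debug guest_loglvl=debug"
```

After regenerating the GRUB config and rebooting, /var/log/xen/console/hypervisor.log should carry the more verbose output.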


The rootfs-not-found error is different from what I had in mind, and probably also different from what @marmarek described (since here we see kernel output).

@adrelanos what kernel and initramfs do you have in this template? Is it new enough to have the updated script for rootfs mounting in initramfs?

@HW42

HW42 commented Oct 26, 2017

about 15 lines of Begin: Running /scripts/local-block ... done. are shown in the VGA console only, not in guest-debian-stretch.log.

If you want this output also in the xen guest console you need to add console=hvc0 in your grub config.
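
Inside the VM, that could be done with a fragment like the one below. This is a sketch under assumptions: it reuses the /etc/default/grub.d/30-qubes.cfg file mentioned earlier in this thread as the place to edit, and it expects sudo update-grub to be run afterwards so the change reaches /boot/grub/grub.cfg.

```shell
# /etc/default/grub.d/30-qubes.cfg inside the VM (fragment)
# Add the Xen console to the kernel command line, keeping existing options.
# The ":-" form only keeps this line valid if the variable is not set yet.
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX:-} console=hvc0"
```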

@adrelanos
Member Author

@adrelanos what kernel and initramfs do you have in this template?

It's a debian-8 template, cloned and upgraded to debian-stretch.

ii  linux-image-4.9.0-4-amd64                     4.9.51-1                                   amd64        Linux 4.9 for 64-bit PCs
ii  linux-image-amd64                             4.9+80+deb9u2                              amd64        Linux for 64-bit PCs (meta-package)
ii  initramfs-tools                               0.130                                      all          generic modular initramfs generator (automation)

Is it new enough to have the updated script for rootfs mounting in initramfs?

Does the above look recent enough?

Probably a separate bug indeed. Perhaps no one has tried debian-stretch with a VM kernel yet?


hypervisor.log is now populated (without the root fs error, which is easy to avoid by disabling the VM kernel and debug mode). Sent to @marmarek and @HW42.

Should I redo the other logs or provide something else?

@marmarek
Member

Do you have qubes-kernel-vm-support package installed? Try regenerating initramfs.
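
Both steps can be run inside the Debian template along these lines. A minimal sketch: the function name regenerate_initramfs is made up for illustration, the package name is taken from the comment above, and dpkg -s and update-initramfs -u -k all are the standard Debian commands for checking a package and rebuilding the initramfs for all installed kernels.

```shell
#!/bin/sh
# Run inside the Debian template.
regenerate_initramfs() {
    # dpkg -s exits non-zero when the package is not installed.
    if dpkg -s qubes-kernel-vm-support >/dev/null 2>&1; then
        echo "qubes-kernel-vm-support is installed"
    else
        echo "qubes-kernel-vm-support is NOT installed" >&2
        return 1
    fi
    # Rebuild the initramfs for every installed kernel.
    sudo update-initramfs -u -k all
}

# Example:
#   regenerate_initramfs
```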

@adrelanos adrelanos changed the title qrexec and sudo xl console vmname - broken at the same time in any Debian based VMs VM does not start - Qubes R4 - qrexec and sudo xl console vmname - broken at the same time in any Debian based VMs Oct 31, 2017
HW42 added a commit to HW42/qubes-vmm-xen-stubdom-linux that referenced this issue Nov 7, 2017
@HW42

HW42 commented Nov 18, 2017

@adrelanos do you still have startup problems with xen-hvm-stubdom-linux 1.0.5 (it has been in the testing repo for a few days)?

@adrelanos
Member Author

Progress here, how awesome! :)

I didn't look into this ticket for a few days and didn't test my Qubes R4 notebook much. Too bad that git commits referenced here don't trigger e-mail notifications, so I didn't notice the updated package that may fix it.

Anyhow, in the little testing I did, I didn't notice any VM startup issues.
