System shutdown hang #1581

Open
marmarek opened this Issue Jan 4, 2016 · 20 comments

marmarek commented Jan 4, 2016

Sometimes Qubes R3.1 shutdown never finishes.

marmarek added this to the Release 3.1 milestone Jan 4, 2016

taradiddles Jan 12, 2016

Maybe it's a different issue: in my case shutdown finishes but takes a long time because /tmp is busy and cannot be unmounted, which in turn prevents the LUKS volume from being closed, which then prevents the underlying device from being stopped, which triggers a handful of timeouts. Eventually systemd shuts down or reboots the machine, though. The issue is reproducible; I get it every time I shut down.
Note: Qubes was installed with automatic partitioning, no fancy custom setup here.

Log excerpt attached: journal-shutdown.txt
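
A minimal sketch for narrowing down what keeps a mount point such as /tmp busy, meant to be run while the system is still up. It only inspects each process's working directory through /proc (Linux only), so it gives a partial view; where they are installed, `fuser -vm /tmp` or `lsof` also report open files. The function name `cwd_holders` is made up for this example.

```shell
#!/bin/bash
# List PIDs whose working directory is at or below the given directory.
# Linux only: relies on the /proc/<pid>/cwd symlinks.
cwd_holders() {
    # Resolve symlinks so the comparison uses the physical path.
    target=$(cd "$1" && pwd -P) || return 1
    for link in /proc/[0-9]*/cwd; do
        # Links of other users' processes may be unreadable; skip those.
        dir=$(readlink "$link" 2>/dev/null) || continue
        case "$dir" in
            "$target"|"$target"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

cwd_holders /tmp
```

Any PID it prints has its working directory under /tmp and would have to exit (or change directory) before the unmount can succeed.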

cfcs Jan 12, 2016

I encounter this occasionally on R2 too.

taradiddles Mar 1, 2016

FWIW, I tried to debug this as per https://wiki.freedesktop.org/www/Software/systemd/Debugging/#index2h1. For some reason /var/lib/xenstored can't be unmounted; the umount process exits with return status 32. See the lines from [173.256405] and [173.265964] onward in the attached systemd debug output.

shutdown.log.txt

tasket May 6, 2016

I'm suffering from this problem too. There are times when I've returned to the room hours later to find the laptop stuck on "waiting for 2 jobs", i.e. mounted volumes.

The power management stack ought to have a low-level timeout feature that turns off the power no matter what, once the shutdown process has been started from the UI.
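
This is not the low-level power-management feature asked for above, only a user-space approximation of the same idea: give a step a hard deadline and carry on regardless once it passes. The sketch uses timeout(1) from coreutils; the helper name `run_with_deadline` is hypothetical, and `sleep 10` stands in for a shutdown step that hangs.

```shell
#!/bin/bash
# Run a command, but give up once a deadline passes.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    seconds=$1
    shift
    # timeout(1) sends SIGTERM at the deadline and then exits with status 124.
    timeout "$seconds" "$@"
}

# Stand-in for a shutdown step that never returns.
if run_with_deadline 1 sleep 10; then
    echo "finished in time"
else
    echo "deadline exceeded; a real script could power off here regardless"
fi
```

In a real wrapper script the guarded command would presumably be something like the VM shutdown step, with a much longer deadline.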

lorenzog May 12, 2016

I think I've isolated this problem. In my case I have a proxy VM that I use for VPN, and shutdown fails only when the proxy VM is running at the same time as other VMs that rely on it for networking. It seems there should be some 'reverse shutdown order' so that dependent VMs are shut down first, and then the proxy VM.

Trying to shut down the proxy VM by hand fails saying 'there are other VMs connected to this VM' so maybe that's what's happening.

Could anyone try to replicate this? I can shut down the system reliably if I manually shut down the VMs in the correct reverse order first.
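
The 'reverse shutdown order' idea can be sketched as a topological sort: given "VM NETVM" pairs, tsort(1) from coreutils prints every VM before the VM that provides its networking. The VM names and the `shutdown_order` helper are hypothetical, how the pairs would be collected from a real system is left out, and the loop only prints the commands it would run.

```shell
#!/bin/bash
# Read "VM NETVM" pairs on stdin and print all VMs so that each VM
# comes before the VM that provides its network.
shutdown_order() {
    # tsort prints the left item of every pair before the right item.
    tsort
}

# Hypothetical topology: work -> sys-vpn -> sys-firewall -> sys-net.
pairs='work sys-vpn
sys-vpn sys-firewall
sys-firewall sys-net'

printf '%s\n' "$pairs" | shutdown_order | while read -r vm; do
    # Dry run: print the command instead of executing it.
    echo "qvm-shutdown --wait $vm"
done
```

With this chain the dependent VM is handled first and sys-net last, which matches the manual order described above.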

cyrinux May 12, 2016

Hi,
I can confirm this: I have the same problem and use the same temporary workaround.
Regards,

taradiddles May 12, 2016

Shutting down all VMs first indeed solves the problem.
But I don't have a ProxyVM; in my case the problem is that sys-net's kernel crashes during shutdown (probably issue #1978). I either need to kill it or wait for a timeout.

Truthlighting May 31, 2016

I have had this problem also. Haven't tried any workarounds yet.

lorenzog Jun 1, 2016

A small update: running qvm-shutdown --all works intermittently. If I turn off the VPN first it works more reliably, but it still occasionally hangs when trying to shut down the 'VPN' VM, regardless of the VPN state.

tasket Jun 1, 2016

@lorenzog adding --force to qvm-shutdown is supposed to shut them all down regardless of network dependency... But even that doesn't work. I get the same hang and logged issue #1826 for it. BTW, I also use a VPN VM extensively.

andrewdavidwong Nov 19, 2016

Member

On 2016-11-16 13:27, Loren Rogers wrote:

On 11/16/2016 02:33 PM, Grzesiek Chodzicki wrote:

On Wednesday, November 16, 2016 at 20:04:14 UTC+1, Loren Rogers wrote:

Hi all,

I've successfully installed Qubes on my Thinkpad X201 tablet, but it has
issues shutting down. When I explicitly tell it to reboot or shutdown,
it goes through the entire shutdown sequence, but hangs on an empty
black screen. Occasionally, I see an unchanging white underscore (_)
character displayed in the top left when it hangs.

I tried leaving it in this state for about an hour, and no change--I've
always had to force-reset. I assume this is not normal?

Also, I find that the system randomly begins the shutdown sequence on
its own. (And hangs on the black screen at the end.)

Thanks,
Loren

The same issue occurs on my system only if I shut the system down while a VM with a PCI device without FLR support is running.

Also, I just confirmed that it shuts down cleanly with all VMs off and no USB devices plugged in.

saucemcboss Nov 19, 2016

I may backtrack my comment a little--I've since had one instance where it wouldn't shut down even with nothing plugged in. The sys-net VM is connected to a VPN, but I don't believe that this has been correlated with the shutdown issue.

Perhaps on a related note, I believe my machine may be having overheating issues, a known problem with non-Windows distros on the Lenovo X201t. It randomly shuts down completely without warning.

tasket Nov 19, 2016

I believe the VPN shutdown delay is a consistent problem, but that it's separate from the problem of the system not reaching power-off/reset. If I stop the openvpn process in the VPN VM before I shut down, the delay still occurs (whether I'm doing a regular shutdown or a qvm-shutdown --all).

OTOH, if I have no VMs running, a system shutdown can still hang.

With R3.2, the VPN shutdown delay always occurs, but the failure to shut down occurs only around 2% of the time (irregularly).

andrewdavidwong Dec 11, 2016

Member

FWIW, I've been using simple shutdown and reboot scripts to work around this issue (I think since I first started using Qubes). Here's an example:

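#!/bin/bash
# First pass: shut down everything except the service VMs.
qvm-shutdown --wait --all \
    --exclude=sys-net \
    --exclude=sys-firewall \
    --exclude=sys-usb \
    --exclude=sys-whonix
# Second pass: shut down the remaining service VMs.
qvm-shutdown --wait --all
sleep 3
shutdown now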

Works every time. The idea is to exclude all service VMs from the first pass of qvm-shutdown so that it doesn't jam up (#1826), so if you use VPN VMs, exclude those too.
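
For setups with extra service or VPN VMs, the first-pass command could be assembled from a list instead of being written out by hand. This is only a sketch: the helper name `first_pass_cmd` and the VM name sys-vpn are made up, and the function prints the command rather than running it so the result can be checked first.

```shell
#!/bin/bash
# Print the first-pass qvm-shutdown command with one --exclude option
# per service VM passed as an argument.
first_pass_cmd() {
    cmd="qvm-shutdown --wait --all"
    for vm in "$@"; do
        cmd="$cmd --exclude=$vm"
    done
    printf '%s\n' "$cmd"
}

# Example list; sys-vpn is a hypothetical VPN VM added to the defaults.
first_pass_cmd sys-net sys-firewall sys-usb sys-whonix sys-vpn
```

The second pass stays a plain `qvm-shutdown --wait --all`, as in the script above.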

andrewdavidwong Dec 31, 2016

Member

User requests to substitute scripts like the one above (#1581 (comment)) as the default reboot and shutdown scripts here.

Any chance of this happening for 4.0, @marmarek?

(Also see: #1826)

saucemcboss Jan 1, 2017

@andrewdavidwong Your script works great for me - very handy! Just confirming that this works on my machine. It would be great if we could have this as the default shutdown operation.

saucemcboss Jan 1, 2017

Actually - I've been using a simplified version of this, which has been fine for me:

#!/bin/bash
# Shut down all VMs in one pass and wait for them to finish.
qvm-shutdown --all --wait
shutdown now

I've not noticed it jamming up, but it may just be my particular installation setup.
