Graceful recovery from stale volume state? #4030
jpouellet changed the title from "Graceful recovery from stale volume state" to "Graceful recovery from stale volume state?" Jun 23, 2018
jpouellet
Jun 23, 2018
Contributor
I have not tracked down how the VM got into that state to begin with. Even if I found the likely cause this time, there may well be others in the future, so I believe a graceful recovery mechanism is warranted regardless.
Err, I just found QubesOS/qubes-core-admin#203 again. Sorry.
jpouellet closed this Jun 23, 2018
jpouellet commented Jun 23, 2018
It is possible in practice for a Qubes system to somehow be interrupted such that at a later time VMs exist with vm-foo-volatile and vm-foo-private-{snap,tmp} volumes, but no vm-foo-private volume.
When you go to start such a VM, you get, e.g.:
and the VM does not start.
In order to be able to start it again, I need to do:
and restart qubesd.
This has only happened to me on some non-critical VMs, but apparently it has happened to others on more important ones, which is a rather unfortunate failure mode (especially for a typical user not familiar with Qubes internals or LVM).
I believe some kind of graceful automatic recovery from such a state is warranted. I haven't been close enough to the code recently to have a strong opinion on what that should look like. Just checking for temporary VM names during boot (when we are guaranteed not to have any VMs running) sounds to me like it should work and be safe, but I'm not sure.
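To make the boot-time check a bit more concrete, here is a rough sketch of the detection half only. It is not taken from the Qubes code: the function name `find_stale_private_volumes` is hypothetical, the suffixes are just the ones from the volume names mentioned above, and it assumes the caller has already collected the logical volume names somehow. It only reports candidates and does not attempt any repair.

```python
# Hypothetical sketch: detect VMs whose temporary private volumes exist
# while the base private volume is missing. Naming is assumed from the
# examples in this issue (vm-foo-private, vm-foo-private-{snap,tmp}).
PREFIX = "vm-"
BASE_SUFFIX = "-private"
TEMP_SUFFIXES = ("-private-snap", "-private-tmp")


def find_stale_private_volumes(volume_names):
    """Return {vm_name: [leftover temporary volumes]} for every VM that
    has a -private-snap or -private-tmp volume but no -private volume."""
    names = set(volume_names)
    stale = {}
    for name in sorted(names):
        if not name.startswith(PREFIX):
            continue
        for suffix in TEMP_SUFFIXES:
            if name.endswith(suffix):
                # Strip the prefix and suffix to recover the VM name.
                vm = name[len(PREFIX):-len(suffix)]
                if vm and PREFIX + vm + BASE_SUFFIX not in names:
                    stale.setdefault(vm, []).append(name)
                break
    return stale


if __name__ == "__main__":
    volumes = [
        "vm-foo-volatile",
        "vm-foo-private-snap",
        "vm-foo-private-tmp",
        "vm-bar-private",
        "vm-bar-private-snap",
    ]
    print(find_stale_private_volumes(volumes))
```

In this example only `foo` is reported, since `bar` still has its base private volume. What to do with a reported VM (which leftover volume, if any, is safe to promote) is exactly the part I don't have a strong opinion on.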