Graceful recovery from stale volume state? #4030

Closed
jpouellet opened this Issue Jun 23, 2018 · 2 comments

jpouellet commented Jun 23, 2018

In practice, a Qubes system can be interrupted in such a way that a VM (here, foo) is later left with vm-foo-volatile and vm-foo-private-{snap,tmp} volumes, but no vm-foo-private volume.

When you go to start such a VM, you get, e.g.:

[user@dom0 ~]$ qvm-run -a foo xterm
volume qubes_dom0/vm-foo-private missing

and the VM does not start.

To be able to start it again, I need to run:

sudo lvrename qubes_dom0 vm-foo-private-tmp vm-foo-private
sudo lvremove qubes_dom0/vm-foo-private-snap
sudo lvremove qubes_dom0/vm-foo-volatile

and restart qubesd.
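
The manual steps above can be wrapped in a small dry-run helper. This is only a sketch: `print_recovery_cmds` is a hypothetical name, the volume group defaults to `qubes_dom0` as in the commands above, and it assumes the VM is in exactly the state described (a `-private-tmp` volume holding the data and no `-private` volume). It prints the commands rather than running them, so they can be reviewed before anything is renamed or removed.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the recovery commands for one VM.
# Usage: print_recovery_cmds VMNAME [VOLUME_GROUP]
print_recovery_cmds() {
    vm=$1
    vg=${2:-qubes_dom0}   # assumption: default VG name taken from the report
    if [ -z "$vm" ]; then
        echo "usage: print_recovery_cmds VMNAME [VOLUME_GROUP]" >&2
        return 1
    fi
    # Same three steps as in the report, parameterized by VM name.
    echo "sudo lvrename ${vg} vm-${vm}-private-tmp vm-${vm}-private"
    echo "sudo lvremove ${vg}/vm-${vm}-private-snap"
    echo "sudo lvremove ${vg}/vm-${vm}-volatile"
}
```

For example, `print_recovery_cmds foo` prints the three commands listed above; qubesd still has to be restarted afterwards.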

This has only happened to me on some non-critical VMs, but it has apparently happened to others on more important ones, which is a rather unfortunate failure mode (especially for a typical user who is not familiar with Qubes internals or LVM).

I believe some kind of graceful automatic recovery from such a state is warranted. I haven't been close enough to the code recently to have a strong opinion on what that should look like. Just checking for temporary volume names during boot (when we are guaranteed not to have any VMs running) sounds to me like it should work and be safe, but I'm not sure.
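
As a rough illustration of the boot-time check suggested above, the sketch below reads logical volume names (one per line) and prints the VMs that have a `-private-tmp` volume but no `-private` volume. `find_stale_vms` is a hypothetical name, and the naming pattern is taken only from the example in this report; this is not what qubesd does, and it only detects the state without changing anything.

```shell
#!/bin/sh
# Hypothetical helper: read LV names on stdin, print the names of VMs that
# have vm-NAME-private-tmp but no vm-NAME-private.
find_stale_vms() {
    awk '
        { gsub(/^[ \t]+|[ \t]+$/, "") }   # strip padding around each name
        /^vm-.+-private-tmp$/ {
            name = $0
            sub(/^vm-/, "", name)
            sub(/-private-tmp$/, "", name)
            tmp[name] = 1
        }
        /^vm-.+-private$/ {
            name = $0
            sub(/^vm-/, "", name)
            sub(/-private$/, "", name)
            priv[name] = 1
        }
        END {
            # Report VMs with a temporary volume but no private volume.
            for (name in tmp)
                if (!(name in priv))
                    print name
        }
    ' | sort
}
```

One possible way to feed it, assuming the `qubes_dom0` volume group from the report, is `sudo lvs --noheadings -o lv_name qubes_dom0 | find_stale_vms`.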

@jpouellet jpouellet changed the title from Graceful recovery from stale volume state to Graceful recovery from stale volume state? Jun 23, 2018

jpouellet commented Jun 23, 2018

I have not tracked down how the VM got into that state to begin with, but I suspect that even if I tracked down the likely cause this time, there may well be others in the future, so I believe a graceful recovery mechanism is warranted regardless.

jpouellet commented Jun 23, 2018

Err, I just found QubesOS/qubes-core-admin#203 again. Sorry.


@jpouellet jpouellet closed this Jun 23, 2018
