VMs sometimes fail to autostart--maybe a race condition #1075

Closed
nvesely opened this Issue Jul 21, 2015 · 14 comments

nvesely commented Jul 21, 2015

On R3-rc1 I have 7 VMs set to autostart altogether, and usually only net, firewall, vpn, and one or two others will actually start. The other 2 or 3 will fail. Below is a journalctl log of a VM autostart service failing, first on its own and then in context.

I tried to figure this out myself, but I'm just learning Python. From the traceback it seems like the volatile image creation is crashing, and the error message says a loop device is busy. From what I understand, the loop device is a snapshot of the root.img from the TemplateVM. The name is supposed to be connected to the inode number, but it seems like they are just sequential (e.g., this one is loop11). Is the command mkswap -f ${loopdev}p1 > /dev/null in /usr/lib/qubes/prepare-volatile-img.sh what's failing? I read the TemplateImplementation documentation and thought I understood it, but that still seems to leave me a long way from understanding the actual template implementation...
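For anyone digging into this, the attached loop devices and their backing files can be inspected in dom0 with standard util-linux tools; this is only a diagnostic aid, not part of the Qubes tooling:

# List every attached loop device together with its backing file
losetup -a

# Show the partitions (p1, p2) the kernel created on a given loop device
lsblk /dev/loop11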

journalctl of just researchVM startup:

-- Reboot --
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM research...
Jul 19 21:25:41 dom0 qvm-start[2442]: --> Creating volatile image: /var/lib/qubes/appvms/research/volatile.img...
Jul 19 21:25:42 dom0 qvm-start[2442]: /dev/loop11p1: Device or resource busy
Jul 19 21:25:42 dom0 qvm-start[2442]: Traceback (most recent call last):
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/bin/qvm-start", line 125, in <module>
Jul 19 21:25:42 dom0 qvm-start[2442]: main()
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/bin/qvm-start", line 109, in main
Jul 19 21:25:42 dom0 qvm-start[2442]: xid = vm.start(verbose=options.verbose, preparing_dvm=options.preparing_dvm, start_guid=not options.noguid, notify_function=tray_notify_generic if options.tray else None)
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py", line 1696, in start
Jul 19 21:25:42 dom0 qvm-start[2442]: self.storage.prepare_for_vm_startup(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 196, in prepare_for_vm_startup
Jul 19 21:25:42 dom0 qvm-start[2442]: self.reset_volatile_storage(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 193, in reset_volatile_storage
Jul 19 21:25:42 dom0 qvm-start[2442]: self.volatile_img, str(self.root_img_size / 1024 / 1024)])
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
Jul 19 21:25:42 dom0 qvm-start[2442]: raise CalledProcessError(retcode, cmd)
Jul 19 21:25:42 dom0 qvm-start[2442]: subprocess.CalledProcessError: Command '['/usr/lib/qubes/prepare-volatile-img.sh', '/var/lib/qubes/appvms/research/volatile.img', '10240']' returned non-zero exit status 1
Jul 19 21:25:42 dom0 systemd[1]: qubes-vm@research.service: main process exited, code=exited, status=1/FAILURE
Jul 19 21:25:42 dom0 systemd[1]: Failed to start Start Qubes VM research.
Jul 19 21:25:42 dom0 systemd[1]: Unit qubes-vm@research.service entered failed state.
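The failing step can also be re-run in isolation, using the exact command and arguments shown in the traceback above (run in dom0; whether root/sudo is required is an assumption here, depending on the script's own privilege handling):

# Re-run the script that qvm-start invoked, with the same arguments
sudo /usr/lib/qubes/prepare-volatile-img.sh /var/lib/qubes/appvms/research/volatile.img 10240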

Full journalctl log:

Jul 19 21:25:41 dom0 systemd[1]: Started Qubes NetVM startup.
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM media...
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM usbvm...
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM research...
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM vault...
Jul 19 21:25:41 dom0 systemd[1]: Starting Start Qubes VM work...
Jul 19 21:25:41 dom0 qvm-start[2441]: --> Creating volatile image: /var/lib/qubes/appvms/usbvm/volatile.img...
Jul 19 21:25:41 dom0 qvm-start[2444]: --> Creating volatile image: /var/lib/qubes/appvms/work/volatile.img...
Jul 19 21:25:41 dom0 qvm-start[2442]: --> Creating volatile image: /var/lib/qubes/appvms/research/volatile.img...
Jul 19 21:25:41 dom0 qvm-start[2443]: --> Creating volatile image: /var/lib/qubes/appvms/vault/volatile.img...
Jul 19 21:25:41 dom0 qvm-start[2440]: --> Creating volatile image: /var/lib/qubes/appvms/media/volatile.img...
Jul 19 21:25:41 dom0 kernel: loop11: p1 p2
Jul 19 21:25:41 dom0 kernel: loop12: p1 p2
Jul 19 21:25:41 dom0 kernel: loop14: p1 p2
Jul 19 21:25:42 dom0 qvm-start[2442]: /dev/loop11p1: Device or resource busy
Jul 19 21:25:42 dom0 qvm-start[2440]: /dev/loop13p1: No such file or directory
Jul 19 21:25:42 dom0 qvm-start[2442]: Traceback (most recent call last):
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/bin/qvm-start", line 125, in <module>
Jul 19 21:25:42 dom0 qvm-start[2442]: main()
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/bin/qvm-start", line 109, in main
Jul 19 21:25:42 dom0 qvm-start[2442]: xid = vm.start(verbose=options.verbose, preparing_dvm=options.preparing_dvm, start_guid=not options.noguid, notify_function=tray_notify_generic if options.tray else None)
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py", line 1696, in start
Jul 19 21:25:42 dom0 qvm-start[2442]: self.storage.prepare_for_vm_startup(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 196, in prepare_for_vm_startup
Jul 19 21:25:42 dom0 qvm-start[2440]: Traceback (most recent call last):
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/bin/qvm-start", line 125, in <module>
Jul 19 21:25:42 dom0 qvm-start[2440]: main()
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/bin/qvm-start", line 109, in main
Jul 19 21:25:42 dom0 qvm-start[2440]: xid = vm.start(verbose=options.verbose, preparing_dvm=options.preparing_dvm, start_guid=not options.noguid, notify_function=tray_notify_generic if options.tray else None)
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py", line 1696, in start
Jul 19 21:25:42 dom0 qvm-start[2440]: self.storage.prepare_for_vm_startup(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 196, in prepare_for_vm_startup
Jul 19 21:25:42 dom0 qvm-start[2442]: self.reset_volatile_storage(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2440]: self.reset_volatile_storage(verbose=verbose)
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 193, in reset_volatile_storage
Jul 19 21:25:42 dom0 qvm-start[2440]: self.volatile_img, str(self.root_img_size / 1024 / 1024)])
Jul 19 21:25:42 dom0 qvm-start[2440]: File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/site-packages/qubes/storage/__init__.py", line 193, in reset_volatile_storage
Jul 19 21:25:42 dom0 qvm-start[2442]: self.volatile_img, str(self.root_img_size / 1024 / 1024)])
Jul 19 21:25:42 dom0 qvm-start[2442]: File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
Jul 19 21:25:42 dom0 qvm-start[2441]: --> Loading the VM (type = AppVM)...
Jul 19 21:25:42 dom0 qvm-start[2442]: raise CalledProcessError(retcode, cmd)
Jul 19 21:25:42 dom0 qvm-start[2442]: subprocess.CalledProcessError: Command '['/usr/lib/qubes/prepare-volatile-img.sh', '/var/lib/qubes/appvms/research/volatile.img', '10240']' returned non-zero exit status 1
Jul 19 21:25:42 dom0 qvm-start[2440]: raise CalledProcessError(retcode, cmd)
Jul 19 21:25:42 dom0 qvm-start[2440]: subprocess.CalledProcessError: Command '['/usr/lib/qubes/prepare-volatile-img.sh', '/var/lib/qubes/appvms/media/volatile.img', '10240']' returned non-zero exit status 1
Jul 19 21:25:42 dom0 qvm-start[2444]: --> Loading the VM (type = AppVM)...
Jul 19 21:25:42 dom0 qvm-start[2443]: --> Loading the VM (type = AppVM)...
Jul 19 21:25:42 dom0 libvirtd[1224]: End of file while reading data: Input/output error
Jul 19 21:25:42 dom0 systemd[1]: qubes-vm@media.service: main process exited, code=exited, status=1/FAILURE
Jul 19 21:25:42 dom0 systemd[1]: Failed to start Start Qubes VM media.
Jul 19 21:25:42 dom0 systemd[1]: Unit qubes-vm@media.service entered failed state.
Jul 19 21:25:42 dom0 libvirtd[1224]: End of file while reading data: Input/output error
Jul 19 21:25:42 dom0 systemd[1]: qubes-vm@research.service: main process exited, code=exited, status=1/FAILURE
Jul 19 21:25:42 dom0 systemd[1]: Failed to start Start Qubes VM research.
Jul 19 21:25:42 dom0 systemd[1]: Unit qubes-vm@research.service entered failed state.

marmarek added this to the Release 3.0 milestone Jul 21, 2015

mig5 commented Jul 28, 2015

This is a real nightmare when the VM that fails to start is your Yubikey USBVM and you can't log in as a result :)

I can reproduce this exactly. Any tips on things to try? I can report results. I'm pretty good with Python, but I lack an understanding of the race condition (it does feel like one) and of how VMs start up under the hood, so I don't know what to modify.

marmarek commented Jul 28, 2015

I'm aware of the issue and also able to reproduce it.
Take a look at /usr/lib/qubes/prepare-volatile-img.sh
(qubes-core-admin/linux/aux-tools/prepare-volatile-img.sh). This
script is responsible for creating volatile.img, which consists of two
partitions:

  1. swap (1G)
  2. space for an overlay over root.img (same size as root.img) - just
    empty space

This script creates the image, initializes the partition table and then
attaches that image as a partitioned loop device (loop3p1, loop3p2, etc.).
This step is apparently racy: when multiple instances of this script are
called at the same time, sometimes two (or more) instances think that
they have the same loop device number (loop3, for example).

I think for now the solution would be to guard this script with some
lock - for example using the flock utility.

Later we might move that image preparation to the VM's initramfs, so dom0
doesn't need to know anything about its layout. But it is too late for
such a change to go into R3.0.

I'd really appreciate help here with fixing this issue :)
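For illustration, a minimal sketch of the kind of flock guard being suggested, wrapped around the racy part of the script; the lock file path here is a hypothetical choice, not part of any actual patch:

# Take an exclusive lock before touching loop devices, so concurrent
# qvm-start instances serialize on this section.
LOCKFILE=/var/run/qubes/volatile-img.lock   # hypothetical path

exec 9>"$LOCKFILE"
flock 9

# ... create volatile.img, write the partition table, attach the
# partitioned loop device and run mkswap here ...

# The lock is released automatically when the script exits and file
# descriptor 9 is closed; an explicit "flock -u 9" would also work.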


mig5 commented Jul 28, 2015

I've implemented a flock fix for this, with a simple loop that waits until the lock is free. The pull request is at QubesOS/qubes-core-admin#4 if you wish to use it.

The gist at https://gist.github.com/mig5/a5a6f194b8221263d6f9 has an example excerpt of the journalctl output. My 3 VMs have started every time, five times over, after applying the patch. Look for the message 'Lock is already obtained.. waiting 5 seconds til we try again'.

To avoid an infinite loop we set a max counter and just give up if it is exceeded - but since applying the patch it has worked for me every time, on the second try.

Let me know what you think.
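A rough sketch of the retry approach described above; the lock path, retry count, and messages are illustrative, not the exact contents of the pull request:

LOCKFILE=/var/run/qubes/volatile-img.lock   # hypothetical path
MAX_TRIES=12
tries=0

exec 9>"$LOCKFILE"
# Try to take the lock without blocking; if another instance holds it,
# wait and retry, giving up with a non-zero exit status after MAX_TRIES
# attempts so we never loop forever.
until flock -n 9; do
    tries=$(( tries + 1 ))
    if [ "$tries" -ge "$MAX_TRIES" ]; then
        echo "Lock could not be acquired, giving up" >&2
        exit 1
    fi
    echo "Lock is already obtained.. waiting 5 seconds til we try again"
    sleep 5
done

# ... the racy loop-device setup runs here while holding the lock ...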

marmarek commented Jul 28, 2015

Maybe just "flock" (without -n), instead of a loop?


mig5 commented Jul 28, 2015

My concern was that the other VMs would simply fail to start if a lock was present at all, hence the loop to try again. We want to keep retrying for a sane amount of time until they succeed.

I'll try it out though.

marmarek commented Jul 28, 2015

flock should just wait for the other process to release the lock.


mig5 commented Jul 30, 2015

Sorry, you are right that flock will wait by default, but I tried different methods and couldn't find a way to get it to work. Maybe that's because I couldn't work out at which step to run it (before which command) and therefore when to release it.

I've given up for now; my original way certainly works well, it's just a bit overkill I guess. I hope it sets someone else on the right path.
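One pattern that sidesteps the "when to release it" question is to scope the lock to a subshell on a dedicated file descriptor, so it is released automatically when the block ends; a sketch only, with a hypothetical lock path:

(
    # Block here until we hold the lock; everything inside this
    # subshell then runs serialized against other instances.
    flock 9

    # ... the racy loop-device setup goes here ...

) 9>/var/run/qubes/volatile-img.lock
# The lock is dropped as soon as the subshell exits and fd 9 closes.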

adrelanos commented Jul 30, 2015

I guess we also want non-zero exit codes in case no lock could be acquired?

"Lock is already obtained" - this sounds positive. Suggested change: "Lock could not be acquired"

adrelanos commented Jul 30, 2015

Working on a patch. Do you want the following code block inside or outside the flock?

TOTAL_SIZE=$[ $ROOT_SIZE + $SWAP_SIZE + 512 ]
truncate -s ${TOTAL_SIZE}M "$FILENAME"
sfdisk --no-reread -u M "$FILENAME" > /dev/null 2> /dev/null <<EOF
0,${SWAP_SIZE},S
,${ROOT_SIZE},L
EOF
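Placed inside the locked region, that block might look roughly like this, reusing the subshell pattern sketched earlier (with the deprecated $[ ] arithmetic written as $(( )) for clarity); this is purely illustrative and doesn't say which placement was actually chosen:

(
    flock 9
    TOTAL_SIZE=$(( ROOT_SIZE + SWAP_SIZE + 512 ))
    truncate -s ${TOTAL_SIZE}M "$FILENAME"
    sfdisk --no-reread -u M "$FILENAME" > /dev/null 2> /dev/null <<EOF
0,${SWAP_SIZE},S
,${ROOT_SIZE},L
EOF
    # ... loop-device attachment and mkswap would follow here,
    # still under the lock ...
) 9>/var/run/qubes/volatile-img.lock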

adrelanos added a commit to adrelanos/qubes-core-admin that referenced this issue Jul 30, 2015

adrelanos referenced this issue in QubesOS/qubes-core-admin Jul 30, 2015

Merged

fix VM autostart race condition #5

adrelanos commented Jul 30, 2015

Please check this pull request:
QubesOS/qubes-core-admin#5

Is that what you had in mind, @marmarek?

I haven't tested this yet; could you have a look, and perhaps test it, @mig5?

Stuff to consider:

  • Will the folder /var/run/qubes/ be guaranteed to exist?
  • Will flock be installed? In other words, do you have a package dependency on flock already?

marmarek commented Jul 30, 2015

On Thu, Jul 30, 2015 at 08:48:11AM -0700, Patrick Schleizer wrote:

> Please check this pull request:
> QubesOS/qubes-core-admin#5
>
> Is that what you had in mind, @marmarek?

I think that will do. I'll merge it (as well as other waiting PRs)
sometime tomorrow or this weekend.

> I haven't tested this yet; could you have a look, and perhaps test it, @mig5?
>
> Stuff to consider:
>
>   • Will the folder /var/run/qubes/ be guaranteed to exist?

Yes, it is created by the tmpfiles service.

>   • Will flock be installed? In other words, do you have a package dependency on flock already?

It is part of the util-linux package, so yes.
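For context, a directory created by the tmpfiles service corresponds to a tmpfiles.d entry along these lines; this is a hypothetical excerpt, and the actual file name, mode, and ownership shipped by the Qubes packages may differ:

# e.g. in a file such as /usr/lib/tmpfiles.d/qubes.conf
d /var/run/qubes 0775 root qubes -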


mig5 commented Jul 30, 2015

@adrelanos thanks, the patch works for me.

For safety, is it worth using the -w flag to flock, so that if something goes wrong with the loop setup the lock isn't held indefinitely (which could potentially prevent the system from booting)? Just an idea; it might be fine as is.
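The -w variant being suggested would look something like this; the timeout value and lock path are arbitrary illustrations:

# Give up after 60 seconds instead of waiting forever, so a wedged
# instance can't hold up VM startup indefinitely; flock then exits
# non-zero and the failure is reported instead of hanging.
flock -w 60 /var/run/qubes/volatile-img.lock \
    /usr/lib/qubes/prepare-volatile-img.sh /var/lib/qubes/appvms/research/volatile.img 10240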

adrelanos commented Jul 31, 2015

I don't know; that's something @marmarek knows best. Can do. If @marmarek has any requests, I'll be happy to see if I can handle them (likely).

marmarek commented Jul 31, 2015

Regarding system startup, systemd already enforces some timeout (AFAIR
90s), so I don't think it is necessary.
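The effective start timeout for one of these units can be checked directly; the unit name here is taken from the log above:

# Print the start timeout systemd applies to the VM unit
systemctl show qubes-vm@research.service -p TimeoutStartUSec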


marmarek added a commit to marmarek/old-qubes-core-admin that referenced this issue Sep 29, 2015

fix VM autostart race condition
QubesOS/qubes-issues#1075
(cherry picked from commit 472ed54)

Conflicts:
	linux/aux-tools/prepare-volatile-img.sh

In R2 this bug has a much smaller impact because of the use of
clean-volatile.img.tar for PV domains.