New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libvirtError: internal error: libxenlight failed to detach disk 'xvdi' #2094

Open
andrewdavidwong opened this Issue Jun 20, 2016 · 11 comments

Comments

5 participants
@andrewdavidwong
Member

andrewdavidwong commented Jun 20, 2016

User reports receiving this error when attempting to detach a device from a VM in Qubes Manager.

https://groups.google.com/d/msgid/qubes-users/897c5386-9d69-4852-a7dd-47ff8a02672d%40googlegroups.com

@andrewdavidwong

This comment has been minimized.

Show comment
Hide comment
Member

andrewdavidwong commented Jul 31, 2016

@bisam

This comment has been minimized.

Show comment
Hide comment
@bisam

bisam Sep 8, 2016

I get exactly this error when I unplug some USB drive without detaching it first.

What I did:

  1. Plugged in usb3 thumbdrive (on it was some linux live distro)
  2. attached it to appVM (arch-template)
  3. formated (ext4)
  4. mounted it, tried to write some files on it, but failed because of permissions (probably not connected?), then unmounted it
  5. ripped out thumbdrive
  6. plugged in second usb2 thumbdrive (in same port)
  7. wanted to attach it. Realized that it is already attached (Icon in QubesManager)
  8. Out of curiosity clicked detach (the blockdevice name changed since the last drive and was the actual connected drive)
  9. Houston we have a Problem - with the exact same error message as described above.
  10. Show Details OR Cancle OR closing does not work
  11. After some clicking I can force to close qubes-manager
  12. If I start it again, nothing changed. I can reproduce the exact same behaviour.

In the AppVM there is no block device I can connect to (lsblk does not show anything than the usual)

dmesg output:

[19179.146004] EXT4-fs (xvdi1): mounted filesystem with ordered data mode. Opts: (null) [19232.317043] Setting capacity to 0 [19232.317054] xvdi: detected capacity change from 16013942784 to 0 [19232.344456] vbd vbd-51840: 16 Device in use; refusing to close [19659.998390] block xvdi: device/vbd/51840 was hot-unplugged, 2 stale handles

I am not sure how exactly these things work, but in sys-usb there is also no block device showing up.
sys-usb dmesg output:

[18641.447162] usb 3-2: new SuperSpeed USB device number 2 using xhci_hcd
[18641.458672] usb 3-2: New USB device found, idVendor=0781, idProduct=5580
[18641.458697] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[18641.458716] usb 3-2: Product: Extreme
[18641.458728] usb 3-2: Manufacturer: SanDisk
[18641.458742] usb 3-2: SerialNumber: AA010304130654106445
[18641.459884] usb-storage 3-2:1.0: USB Mass Storage device detected
[18641.460281] scsi host2: usb-storage 3-2:1.0
[18642.462672] scsi 2:0:0:0: Direct-Access SanDisk Extreme 0001 PQ: 0 ANSI: 6
[18642.463425] sd 2:0:0:0: Attached scsi generic sg0 type 0
[18642.463877] sd 2:0:0:0: [sda] 31277232 512-byte logical blocks: (16.0 GB/14.9 GiB)
[18642.464242] sd 2:0:0:0: [sda] Write Protect is off
[18642.464272] sd 2:0:0:0: [sda] Mode Sense: 33 00 00 08
[18642.464721] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[18642.467048] sda: sda1 sda2
[18642.468931] sd 2:0:0:0: [sda] Attached SCSI disk
[18660.842504] xen-blkback: ring-ref 4827, event-channel 85, protocol 1 (x86_64-abi) persistent grants
[19303.716174] usb 3-2: USB disconnect, device number 2
[19303.717139] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[19303.717220] sd 2:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[19303.771018] xen-blkback: VBD Resize: Domid: 6, Device: (8, 0)
[19303.771026] xen-blkback: VBD Resize: new size 0
[19688.464099] usb 2-2: new high-speed USB device number 9 using xhci_hcd
[19688.628829] usb 2-2: New USB device found, idVendor=0781, idProduct=557d
[19688.628850] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[19688.628864] usb 2-2: Product: Cruzer Force
[19688.628873] usb 2-2: Manufacturer: SanDisk
[19688.628882] usb 2-2: SerialNumber: 4C530013461114122220
[19688.631701] usb-storage 2-2:1.0: USB Mass Storage device detected
[19688.632004] scsi host3: usb-storage 2-2:1.0
[19689.635442] scsi 3:0:0:0: Direct-Access SanDisk Cruzer Force 1.27 PQ: 0 ANSI: 6
[19689.636137] sd 3:0:0:0: Attached scsi generic sg0 type 0
[19689.637395] sd 3:0:0:0: [sda] 15633408 512-byte logical blocks: (8.00 GB/7.45 GiB)
[19689.638728] sd 3:0:0:0: [sda] Write Protect is off
[19689.638757] sd 3:0:0:0: [sda] Mode Sense: 43 00 00 00
[19689.639111] sd 3:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[19689.648546] sda: sda1 sda2
[19689.651261] sd 3:0:0:0: [sda] Attached SCSI disk

It is a bit hard to copy stuff out of dom0, so I am typing it: According to lspci my usbcontroller is called: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)

More things I noticed:

  • If I want to shutdown the AppVM, it fails and I have to kill it. Then the "attached block device" icon is still visible, although the vm is shutdown.
  • After the restart of the AppVM the icon is still there but not really attached to the VM. (no signs in any logs or whatsoever) I cannot detach it, but it also does not crash qubes-manager.

bisam commented Sep 8, 2016

I get exactly this error when I unplug some USB drive without detaching it first.

What I did:

  1. Plugged in usb3 thumbdrive (on it was some linux live distro)
  2. attached it to appVM (arch-template)
  3. formated (ext4)
  4. mounted it, tried to write some files on it, but failed because of permissions (probably not connected?), then unmounted it
  5. ripped out thumbdrive
  6. plugged in second usb2 thumbdrive (in same port)
  7. wanted to attach it. Realized that it is already attached (Icon in QubesManager)
  8. Out of curiosity clicked detach (the blockdevice name changed since the last drive and was the actual connected drive)
  9. Houston we have a Problem - with the exact same error message as described above.
  10. Show Details OR Cancle OR closing does not work
  11. After some clicking I can force to close qubes-manager
  12. If I start it again, nothing changed. I can reproduce the exact same behaviour.

In the AppVM there is no block device I can connect to (lsblk does not show anything than the usual)

dmesg output:

[19179.146004] EXT4-fs (xvdi1): mounted filesystem with ordered data mode. Opts: (null) [19232.317043] Setting capacity to 0 [19232.317054] xvdi: detected capacity change from 16013942784 to 0 [19232.344456] vbd vbd-51840: 16 Device in use; refusing to close [19659.998390] block xvdi: device/vbd/51840 was hot-unplugged, 2 stale handles

I am not sure how exactly these things work, but in sys-usb there is also no block device showing up.
sys-usb dmesg output:

[18641.447162] usb 3-2: new SuperSpeed USB device number 2 using xhci_hcd
[18641.458672] usb 3-2: New USB device found, idVendor=0781, idProduct=5580
[18641.458697] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[18641.458716] usb 3-2: Product: Extreme
[18641.458728] usb 3-2: Manufacturer: SanDisk
[18641.458742] usb 3-2: SerialNumber: AA010304130654106445
[18641.459884] usb-storage 3-2:1.0: USB Mass Storage device detected
[18641.460281] scsi host2: usb-storage 3-2:1.0
[18642.462672] scsi 2:0:0:0: Direct-Access SanDisk Extreme 0001 PQ: 0 ANSI: 6
[18642.463425] sd 2:0:0:0: Attached scsi generic sg0 type 0
[18642.463877] sd 2:0:0:0: [sda] 31277232 512-byte logical blocks: (16.0 GB/14.9 GiB)
[18642.464242] sd 2:0:0:0: [sda] Write Protect is off
[18642.464272] sd 2:0:0:0: [sda] Mode Sense: 33 00 00 08
[18642.464721] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[18642.467048] sda: sda1 sda2
[18642.468931] sd 2:0:0:0: [sda] Attached SCSI disk
[18660.842504] xen-blkback: ring-ref 4827, event-channel 85, protocol 1 (x86_64-abi) persistent grants
[19303.716174] usb 3-2: USB disconnect, device number 2
[19303.717139] sd 2:0:0:0: [sda] Synchronizing SCSI cache
[19303.717220] sd 2:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[19303.771018] xen-blkback: VBD Resize: Domid: 6, Device: (8, 0)
[19303.771026] xen-blkback: VBD Resize: new size 0
[19688.464099] usb 2-2: new high-speed USB device number 9 using xhci_hcd
[19688.628829] usb 2-2: New USB device found, idVendor=0781, idProduct=557d
[19688.628850] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[19688.628864] usb 2-2: Product: Cruzer Force
[19688.628873] usb 2-2: Manufacturer: SanDisk
[19688.628882] usb 2-2: SerialNumber: 4C530013461114122220
[19688.631701] usb-storage 2-2:1.0: USB Mass Storage device detected
[19688.632004] scsi host3: usb-storage 2-2:1.0
[19689.635442] scsi 3:0:0:0: Direct-Access SanDisk Cruzer Force 1.27 PQ: 0 ANSI: 6
[19689.636137] sd 3:0:0:0: Attached scsi generic sg0 type 0
[19689.637395] sd 3:0:0:0: [sda] 15633408 512-byte logical blocks: (8.00 GB/7.45 GiB)
[19689.638728] sd 3:0:0:0: [sda] Write Protect is off
[19689.638757] sd 3:0:0:0: [sda] Mode Sense: 43 00 00 00
[19689.639111] sd 3:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[19689.648546] sda: sda1 sda2
[19689.651261] sd 3:0:0:0: [sda] Attached SCSI disk

It is a bit hard to copy stuff out of dom0, so I am typing it: According to lspci my usbcontroller is called: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)

More things I noticed:

  • If I want to shutdown the AppVM, it fails and I have to kill it. Then the "attached block device" icon is still visible, although the vm is shutdown.
  • After the restart of the AppVM the icon is still there but not really attached to the VM. (no signs in any logs or whatsoever) I cannot detach it, but it also does not crash qubes-manager.
@mfc

This comment has been minimized.

Show comment
Hide comment
@mfc

mfc Oct 5, 2016

Member

FYI I run into this bug even when I properly detach a USB drive or SD card from the VM Manager. error from sys-usb dmseg:

[ 2182.998963] traps: xl[693] general protection ip:7f3e6ebefb0d sp:7fff6b2e1ea0 error:0 in libxenlight.so.4.6.0[7f3e6ebd5000+92000]

Fedora 23-based sys-usb, using multiple USB or SD devices, USB 3.0 enabled in BIOS.

The result is that the Qubes VM Manager hard crashes, needs to be restarted. Once restarted, USB/SD attach/detach no longer works: I can successfully attach (in the Manager) but the USB/SD doesn't show up in the VM. When detached, the VM Manager crashes again.

Let us know if there's a better way to troubleshoot this. Having to reboot fresh in order to cleanly attach/detach USB/SD devices is a pain.

Member

mfc commented Oct 5, 2016

FYI I run into this bug even when I properly detach a USB drive or SD card from the VM Manager. error from sys-usb dmseg:

[ 2182.998963] traps: xl[693] general protection ip:7f3e6ebefb0d sp:7fff6b2e1ea0 error:0 in libxenlight.so.4.6.0[7f3e6ebd5000+92000]

Fedora 23-based sys-usb, using multiple USB or SD devices, USB 3.0 enabled in BIOS.

The result is that the Qubes VM Manager hard crashes, needs to be restarted. Once restarted, USB/SD attach/detach no longer works: I can successfully attach (in the Manager) but the USB/SD doesn't show up in the VM. When detached, the VM Manager crashes again.

Let us know if there's a better way to troubleshoot this. Having to reboot fresh in order to cleanly attach/detach USB/SD devices is a pain.

@mfc

This comment has been minimized.

Show comment
Hide comment
@mfc

mfc Oct 7, 2016

Member

I can dependably crash VM Manager by attaching an SD card to a qube, mounting the SD card in the qube, unmounting the SD card in the qube, and then attempting to detach the SD card. In that final action, VM Manager will hard crash.

Member

mfc commented Oct 7, 2016

I can dependably crash VM Manager by attaching an SD card to a qube, mounting the SD card in the qube, unmounting the SD card in the qube, and then attempting to detach the SD card. In that final action, VM Manager will hard crash.

@tonsimple

This comment has been minimized.

Show comment
Hide comment
@tonsimple

tonsimple Feb 2, 2017

Getting a lot of this (or very similar) behavior with thumbdrives.
Gathered logs and posted it in that separate issue of mine, the one mentioned above (decided to close that one, because, like, 90% that it's a duplicate of this one)

Hope my logs will be of use.

Getting a lot of this (or very similar) behavior with thumbdrives.
Gathered logs and posted it in that separate issue of mine, the one mentioned above (decided to close that one, because, like, 90% that it's a duplicate of this one)

Hope my logs will be of use.

@tonsimple

This comment has been minimized.

Show comment
Hide comment
@tonsimple

tonsimple Feb 2, 2017

I think I've found a workaround:
You'll need a dummy VM to be used as "cleaner" for stuck devices.

Shut down USBVM
Shut down AppVM that had the "stuck" block device attached to it

Start USBVM
Start Cleaning-DummyVM
Attach the (formerly stuck) device to Cleaning-DummyVM
Start the AppVM that had the "stuck" block device attached to it
Detach the block device from Cleaning-DummyVM
Shut down Cleaning-DummyVM

It's clumsy as hell, but it allows to get "stuck" USB thumb drives back into order.

I think I've found a workaround:
You'll need a dummy VM to be used as "cleaner" for stuck devices.

Shut down USBVM
Shut down AppVM that had the "stuck" block device attached to it

Start USBVM
Start Cleaning-DummyVM
Attach the (formerly stuck) device to Cleaning-DummyVM
Start the AppVM that had the "stuck" block device attached to it
Detach the block device from Cleaning-DummyVM
Shut down Cleaning-DummyVM

It's clumsy as hell, but it allows to get "stuck" USB thumb drives back into order.

@tonsimple

This comment has been minimized.

Show comment
Hide comment
@tonsimple

tonsimple Feb 2, 2017

@marmarek
thing is, I'm pretty meticulous about dismount-and-detach (I only tried yanking out as an experiment to see what would happen)

Still get this behavior, especially with one particularly naughty thumbdrive and one microsd.

@marmarek
thing is, I'm pretty meticulous about dismount-and-detach (I only tried yanking out as an experiment to see what would happen)

Still get this behavior, especially with one particularly naughty thumbdrive and one microsd.

@tonsimple

This comment has been minimized.

Show comment
Hide comment
@tonsimple

tonsimple Feb 2, 2017

Maybe there's some more useful/in-depth logging I could obtain by making some adjustments, then triggering this behavior with the "naughty" thumb drive?

Maybe there's some more useful/in-depth logging I could obtain by making some adjustments, then triggering this behavior with the "naughty" thumb drive?

@marmarek

This comment has been minimized.

Show comment
Hide comment
@marmarek

marmarek Feb 2, 2017

Member

You can check /var/log/libvirt/libxl/libxl-driver.log, but I suspect there will be only something like "detach failed". Then, there is low-level method to check if libvirt state matches actual attached devices:

  1. Get list of block devices according to libvirt: virsh -c xen:/// domblklist VMNAME
  2. Get list of block devices according to libxl: xl block-list VMNAME

The later command will not output much details besides cryptic numbers (being device major:minor encoded in decimal) and xenstore path/selected keys, but you can at least compare count of them. Note "state" column - "4" means its connected, other values are transient states (if you see it there, probably something gone wrong).

Member

marmarek commented Feb 2, 2017

You can check /var/log/libvirt/libxl/libxl-driver.log, but I suspect there will be only something like "detach failed". Then, there is low-level method to check if libvirt state matches actual attached devices:

  1. Get list of block devices according to libvirt: virsh -c xen:/// domblklist VMNAME
  2. Get list of block devices according to libxl: xl block-list VMNAME

The later command will not output much details besides cryptic numbers (being device major:minor encoded in decimal) and xenstore path/selected keys, but you can at least compare count of them. Note "state" column - "4" means its connected, other values are transient states (if you see it there, probably something gone wrong).

@tonsimple

This comment has been minimized.

Show comment
Hide comment
@tonsimple

tonsimple Feb 3, 2017

@marmarek

Okay, I have managed to get those logs for both "pre-attempted detach" state and "post failed detach" state.

I have selected three usb devices most prone to "failed detach" and attached them one-by-one, mounted one-by-one, grabbed logs, then dismounted all and attempted to detach one-by-one.

Three devices were used to ensure that I can induce the "detach attempt fails spontaneously" behavior on first try.

Also got dmesg from both the "store" VM (to which thumbdrives were being attached) and USB vm (which handles the USB controller)
USB vm's log has lines like "traps: xl[669] general protection ip:7f52b043ed5d sp:7fff60025230 error:0 in libxenlight.so.4.6.0[7f52b0425000+90000]" which sounds spooky.

So, logs:

Dom0 (just in case, grabbed the log for this experiment specifically, so that all logs for this particular case match")

journalctl stuff

new-badusb2.txt

Libxl-driver log from Dom0
libxl-driver-log.txt

Logs from libvirt, all devices attached to STORE-VM and successfully mounted

From StoreVM
storevm-predetach.txt

From USB vm

usbvm-predetach.txt

Logs from libxl, all devices attached to STORE-VM and mounted

From StoreVM
xl-storevm-predetach.txt

From USB vm
xl-usbvm-predetach.txt

Logs from libvirt, after a failed detach attempt

From StoreVM
storevm-POSTdetach.txt

From USB vm
usbvm-POSTdetach.txt

Logs from libxl, after a failed detach attempt

From StoreVM
xl-storevm-POSTdetach.txt

From USB vm
xl-usbvm-POSTdetach.txt

StoreVM dmesg
badusb-storevm-dmesg.txt

USBVM dmesg
usbvm-dmesg.txt

P.S.:
my own comments are preceded and finished with '**'

tonsimple commented Feb 3, 2017

@marmarek

Okay, I have managed to get those logs for both "pre-attempted detach" state and "post failed detach" state.

I have selected three usb devices most prone to "failed detach" and attached them one-by-one, mounted one-by-one, grabbed logs, then dismounted all and attempted to detach one-by-one.

Three devices were used to ensure that I can induce the "detach attempt fails spontaneously" behavior on first try.

Also got dmesg from both the "store" VM (to which thumbdrives were being attached) and USB vm (which handles the USB controller)
USB vm's log has lines like "traps: xl[669] general protection ip:7f52b043ed5d sp:7fff60025230 error:0 in libxenlight.so.4.6.0[7f52b0425000+90000]" which sounds spooky.

So, logs:

Dom0 (just in case, grabbed the log for this experiment specifically, so that all logs for this particular case match")

journalctl stuff

new-badusb2.txt

Libxl-driver log from Dom0
libxl-driver-log.txt

Logs from libvirt, all devices attached to STORE-VM and successfully mounted

From StoreVM
storevm-predetach.txt

From USB vm

usbvm-predetach.txt

Logs from libxl, all devices attached to STORE-VM and mounted

From StoreVM
xl-storevm-predetach.txt

From USB vm
xl-usbvm-predetach.txt

Logs from libvirt, after a failed detach attempt

From StoreVM
storevm-POSTdetach.txt

From USB vm
usbvm-POSTdetach.txt

Logs from libxl, after a failed detach attempt

From StoreVM
xl-storevm-POSTdetach.txt

From USB vm
xl-usbvm-POSTdetach.txt

StoreVM dmesg
badusb-storevm-dmesg.txt

USBVM dmesg
usbvm-dmesg.txt

P.S.:
my own comments are preceded and finished with '**'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment