This repository has been archived by the owner on Apr 19, 2019. It is now read-only.

cinder LVM volume fails to delete #89

Closed
jcourtois opened this issue Sep 8, 2014 · 16 comments

Comments

@jcourtois

When I try to delete a cinder volume in IAD, it goes from 'active' to 'deleting' but never reaches 'deleted'. Twelve hours later, lvs and lvdisplay show that the volume staged for deletion has not been removed and is still sitting there in a suspended state. No stack traces noted.

https://gist.github.com/jcourtois/1470b0e24a14205eb592

@jcourtois
Author

Reproduced in Lab 02. Instances are stuck in 'creating' and 'deleting'.

https://gist.github.com/jcourtois/dd165a93f1ac5bd3310e

@cloudnull
Member

This issue is related to issue #99 and should be resolved by PR #101.

@jcourtois jcourtois reopened this Sep 20, 2014
@jcourtois
Author

Testing the latest deployment in IAD lab 1. The suite was cleaning up about 8 volumes very rapidly (perhaps only a minute or two after creating them), and that triggered another freeze. :|

I'm seeing a very similar issue, with one additional detail that I don't remember noticing before. If I try to manually delete any of my volumes using lvremove inside the cinder container, I get this:

root@573972-cinder01_cinder_volumes_container-7454dcdb:~# lvremove /dev/mapper/cinder--volumes-volume--73584646--91f4--4651--b3a6--f46ee352fe50
Do you really want to remove and DISCARD active logical volume volume-73584646-91f4-4651-b3a6-f46ee352fe50? [y/n]: y
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  device-mapper: remove ioctl on  failed: Device or resource busy
  Unable to deactivate cinder--volumes-volume--73584646--91f4--4651--b3a6--f46ee352fe50 (252:5)
  Unable to deactivate logical volume "volume-73584646-91f4-4651-b3a6-f46ee352fe50"
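
For readers matching the device-mapper name in the output above to the LV name that lvs reports: LVM doubles every hyphen inside the volume group and logical volume names and joins the two with a single hyphen. A minimal sketch of undoing that escaping follows. The function name `split_dm_name` is made up for illustration, and the sketch assumes a plain LV (it does not handle snapshot layer suffixes such as `-real` or `-cow`).

```python
from typing import Tuple


def split_dm_name(dm_name: str) -> Tuple[str, str]:
    """Split a device-mapper name into (volume_group, logical_volume).

    Hyphens inside the VG and LV names appear doubled in the dm name,
    so the first hyphen that is not part of a pair is the separator.
    """
    i = 0
    while i < len(dm_name):
        if dm_name[i] == "-":
            if i + 1 < len(dm_name) and dm_name[i + 1] == "-":
                i += 2  # escaped hyphen: skip both characters
                continue
            vg, lv = dm_name[:i], dm_name[i + 1:]
            return vg.replace("--", "-"), lv.replace("--", "-")
        i += 1
    raise ValueError(f"no VG/LV separator found in {dm_name!r}")


# Name taken from the lvremove output in this comment.
vg, lv = split_dm_name(
    "cinder--volumes-volume--73584646--91f4--4651--b3a6--f46ee352fe50"
)
print(vg, lv)
```
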

Here are some logs from cinder-volumes.

https://gist.github.com/jcourtois/dd49918a88e4d99cb323

@cloudnull
Member

A couple of questions:

  • Is this a new install, or an upgraded one to the new code base?
  • How long has the volume been in the deleting state?
  • Is or was there a VM attached to the volume?
  • Are there any processes within the container locking the volume? (dd, snapshot, other...)
  • If you restart the container, can you delete the volume?
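
One way to start answering the locking question is to look at what the kernel reports as stacked on top of the device. The sketch below lists the sysfs `holders` directory for a device given as `major:minor` (for example the `252:5` shown in the lvremove output). The helper name `dm_holders` and the `sysfs_root` parameter are assumptions added for illustration. Note that this only shows kernel-level holders, such as another device-mapper device layered on the volume; userspace processes that have the device open would need a tool like lsof or fuser instead.

```python
import os
from typing import List


def dm_holders(major_minor: str, sysfs_root: str = "/sys") -> List[str]:
    """Return the names of block devices holding the given device open.

    Reads <sysfs_root>/dev/block/<major:minor>/holders. Returns an empty
    list when the directory is missing or unreadable.
    """
    holders_dir = os.path.join(sysfs_root, "dev", "block", major_minor, "holders")
    try:
        return sorted(os.listdir(holders_dir))
    except OSError:
        return []


# Device numbers taken from the "Unable to deactivate ... (252:5)" line.
print(dm_holders("252:5"))
```

An empty list here does not prove the volume is free; it only rules out other block devices as the holder.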

@jcourtois
Author

Alright, so the issue did resolve itself; whatever was locking up LVM let go. I added a few more lines to https://gist.github.com/jcourtois/dd49918a88e4d99cb323.
As for your questions:
- This is a new install with the latest code branch.
- The seven or so affected volumes stayed in the deleting state for about 25 minutes, after which they were all deleted within about a minute (about 5-10 seconds per volume).
- These were compute integration tests, so there were probably VMs attached, but I can't say for sure.
- Since the issue resolved itself, I can no longer say.

@jcourtois
Author

Testing is still underway. Since this resolved itself in a reasonable amount of time, I'll close this issue again. If it happens again I'll reopen.

@cloudnull
Member

This is likely just a result of the volume being overwritten with zeros once the delete is executed, a process that takes time and holds a lock on the volume while it is zeroing.

Let us know if this crops up again.
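
As a rough sanity check on the zeroing explanation, the time to overwrite a volume is approximately its size divided by the sustained write throughput. The sketch below does that arithmetic. The function name and the example figures (a 100 GiB volume, 100 MiB/s) are hypothetical and are not measurements from this deployment.

```python
def zeroing_seconds(size_gib: float, throughput_mib_s: float) -> float:
    """Estimate how long it takes to write zeros over a whole volume."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gib * 1024 / throughput_mib_s


# Hypothetical figures: a 100 GiB volume at 100 MiB/s sustained writes.
seconds = zeroing_seconds(100, 100)
print(f"{seconds / 60:.1f} minutes")
```

Several deletes running at once share the same disks, so each one would likely take longer than the single-volume estimate.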

@jcourtois jcourtois reopened this Sep 22, 2014
@jcourtois
Author

Of course it figures that when I stopped testing for the weekend, my last few cinder volumes would exhibit this behavior. I have 3 volumes that have been "deleting" since Saturday night.

Bonus: cinder-volumes has a stacktrace.

https://gist.github.com/jcourtois/49358546b9e4bdeb9242

@cloudnull
Member

Can you execute another delete on the same volume and let us know if it succeeds? It seems that the volume was in a locked state.

@jcourtois
Author

Which volume/snapshot, and using the cinder API or lvremove?

@jcourtois
Author

Root problem? From the kernel logs.

Sep 22 19:31:28 569058-cinder01 kernel: [   12.570914] type=1400 audit(1411414288.192:137): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-openstack" name="/run/cgmanager/fs/none,name=systemd/" pid=6385 comm="cgmanager" fstype="cgroup" srcname="none,name=systemd" flags="rw"
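
To make the audit record above easier to read, here is a small sketch that splits its `key=value` pairs into a dictionary and maps the numeric error to its errno name. The regular expression and the helper name `parse_audit_fields` are illustrative assumptions and not part of any AppArmor tooling.

```python
import errno
import re
from typing import Dict

# key=value where the value is either a quoted string or a bare token.
_PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_fields(line: str) -> Dict[str, str]:
    """Extract key=value fields from a kernel audit log line."""
    return {key: value.strip('"') for key, value in _PAIR.findall(line)}


# The kernel log line quoted in this comment.
SAMPLE = (
    'Sep 22 19:31:28 569058-cinder01 kernel: [   12.570914] type=1400 '
    'audit(1411414288.192:137): apparmor="DENIED" operation="mount" '
    'info="failed type match" error=-13 profile="lxc-openstack" '
    'name="/run/cgmanager/fs/none,name=systemd/" pid=6385 comm="cgmanager" '
    'fstype="cgroup" srcname="none,name=systemd" flags="rw"'
)

fields = parse_audit_fields(SAMPLE)
# The kernel reports errors as negative errno values.
error_name = errno.errorcode.get(abs(int(fields["error"])))
print(fields["apparmor"], fields["operation"], fields["profile"], error_name)
```
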

@jcourtois
Author

The issue appears to have reproduced again in the lab where we changed the change_profile parameter in /etc/apparmor.d/abstractions/lxc/start-container to 'unconfined'. 🍂

@jcourtois jcourtois added the bug label Sep 25, 2014
@jcourtois jcourtois changed the title from "cinder volume failing to delete in IAD" to "cinder LVM volume fails to delete" Sep 25, 2014
@jcourtois
Author

Seeing this again in SAT6. In particular, after taking a snapshot of an LVM volume and then deleting the snapshot, deleting the volume leaves it stuck in the 'deleting' state.

@mancdaz
Contributor

mancdaz commented Oct 16, 2014

@git-harry mentioned that this was a known issue in cinder. @git-harry, does the gist above help you track down this issue?

@jameswthorne

Some additional info: https://gist.github.com/jameswthorne/62453bc79b9a9342acaf

@b3rn4rd0s

This is going to be fixed upstream and is being tracked here: https://bugs.launchpad.net/cinder/+bug/1191960
