
openstack: Add support for using multiple data volumes #688

Merged

Conversation

@meteorfox (Contributor)
Before this PR, data disks were not working. This PR fixes that and adds the ability to use multiple volumes as well.

  • Fix Pickling error caused by use of generator in class attributes.
  • Improve error handling of disk creation
  • Add switch for enabling verbose http log

@meteorfox (Contributor Author)

PTAL @kivio @cmccoy

I've tested this branch on OpenStack Kilo.

@@ -34,7 +34,7 @@ def __init__(self, disk_spec, name, zone, image=None):
    self.name = name
    self.zone = zone
    self.device = ""
    self.virtual_disks = (c for c in "cdefghijklmnopqrstuvwxyz")
@meteorfox (Contributor Author), commenting on the diff:

This was causing a PicklingError when the spec was being pickled. The error was caused by the generator.
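The failure mode can be reproduced in isolation. This is a minimal sketch (the class names here are hypothetical, not the project's): a generator stored as an instance attribute cannot be pickled, while storing the letters as a list with an index round-trips fine.

```python
import pickle

class DiskSpecWithGenerator:
    """Minimal reproduction: a generator stored as an attribute."""
    def __init__(self):
        # Generator objects cannot be pickled, so any spec holding
        # one fails when the spec itself is pickled.
        self.virtual_disks = (c for c in "cdefghijklmnopqrstuvwxyz")

class DiskSpecWithList:
    """Picklable alternative: store the letters and an index instead."""
    def __init__(self):
        self.disk_letters = list("cdefghijklmnopqrstuvwxyz")
        self.next_disk_index = 0

    def NextDiskLetter(self):
        letter = self.disk_letters[self.next_disk_index]
        self.next_disk_index += 1
        return letter

# Pickling the generator-holding spec raises; the list-based spec
# survives a round trip with its state intact.
try:
    pickle.dumps(DiskSpecWithGenerator())
except (TypeError, pickle.PicklingError) as e:
    print("generator spec fails:", e)

spec = pickle.loads(pickle.dumps(DiskSpecWithList()))
print(spec.NextDiskLetter())  # 'c'
```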

@meteorfox meteorfox self-assigned this Dec 1, 2015
@cmccoy (Contributor)

cmccoy commented Dec 2, 2015

Looks good to me.

@meteorfox (Contributor Author)

@cmccoy Thanks for the CR.

I'll wait until the end of the day, or until tomorrow, for @kivio's CR, since he developed most of the OpenStack bindings; I want to make sure it doesn't break any of his stuff.

@voellm (Contributor)

voellm commented Dec 2, 2015

FYI, we added a "Provider" contact list here:
https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/wiki/Provider-contacts

This should help us get reviews from the original developers. Feel free to update it.


@kivio (Contributor)

kivio commented Dec 3, 2015

@meteorfox looks good to me :) I don't currently have an environment to test it, though :(

@meteorfox (Contributor Author)

@kivio Thanks for reviewing.

@cmccoy Can I merge?

@ctlaltlaltc

@meteorfox, I hit a run error in my environment (Liberty). I don't know how to fix it yet.

default_config_constants.yaml:

OpenStack:
  machine_type: m1.small
  zone: nova
  image: ubuntu-pkb
OpenStack:
  disk_type: standard
  disk_size: 10
  mount_point: /scratch

./pkb.py --cloud=OpenStack --benchmarks=fio --noinstall_packages
--openstack_public_network=external_network --openstack_private_network=internal_network

2015-12-04 10:11:56,727 0c7659b8 Thread-1 fio(1/1) ERROR Got exception running FormatDisk: Got non-zero return code (1) executing sudo mke2fs -F -E lazy_itable_init=0 -O ^has_journal -t ext4 -b 4096 /dev/disk/by-id/virtio-ef21dff6-2e72-4a4e-9

mke2fs 1.42.9 (4-Feb-2014)
mke2fs: No such file or directory while trying to determine filesystem size

@meteorfox (Contributor Author)

@9723 Thanks for testing!

Hmm, what's your hypervisor backend? Is it KVM?

Also, do you hit the error every time you run it, or only occasionally?

@ctlaltlaltc

@meteorfox, yes, I get the error every time I run the fio benchmark, and the hypervisor backend is KVM.

@meteorfox (Contributor Author)

@9723 ok, thanks for catching the bug.

I'm investigating the issue. Most likely the bug is that the code assumes all attachment device paths start with /dev/disk/by-id/virtio-, which might not always be the case. We should instead use the device path returned by the attachments property.
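The fix described here amounts to looking the device up in the volume's attachments list instead of constructing a path. A minimal sketch (the helper name and the fake volume class are illustrative; the 'volume_id'/'device' keys follow the attachments dicts shown later in the diff):

```python
def GetAttachedDevicePath(volume, volume_id):
    """Return the device path reported for this volume's attachment.

    `volume.attachments` is assumed to be a list of dicts, each with
    'volume_id' and 'device' keys, as returned by the nova client.
    Relying on this instead of a hard-coded '/dev/disk/by-id/virtio-'
    prefix avoids baking in a KVM-specific naming scheme.
    """
    for attachment in volume.attachments:
        if attachment.get('volume_id') == volume_id:
            return attachment.get('device')
    return None

# Illustrative usage with a stand-in volume object.
class FakeVolume:
    def __init__(self, attachments):
        self.attachments = attachments

vol = FakeVolume([{'volume_id': 'abc', 'device': '/dev/vdb'}])
print(GetAttachedDevicePath(vol, 'abc'))  # /dev/vdb
```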

@meteorfox meteorfox force-pushed the openstack-multi-volumes branch 3 times, most recently from 0c23199 to 616bb08 Compare December 4, 2015 03:30
@meteorfox (Contributor Author)

@9723 Can you pull again and try the latest commit and check if it fixed the problem?

Before this last commit, I was able to replicate the issue but not consistently. After this last commit, I ran several tests with fio using 1 and 4 volumes and it has been consistently working for me.

Let me know if it works for you. Thanks!

@meteorfox (Contributor Author)

I caught another bug. It's not deleting all the volumes when using more than one.

- Fix Pickling error caused by use of generator in class attributes.
- Improve error handling of disk creation
- Add switch for enabling verbose http log
@meteorfox (Contributor Author)

@kivio OK, it turned out that the exception classes in OpenStack/utils.py were not actually catching the errors thrown by the novaclient, so the thread was aborting earlier than expected and skipping the volume-deletion step. I couldn't import the exceptions from the novaclient package, since that would break lazy loading, so I opted to catch the generic Exception instead.

@9723 Ok, this time should work and also clean up the volumes.
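The cleanup problem described above can be sketched as follows (a hypothetical illustration, not the project's code; `delete_fn` stands in for the nova client's volume-delete call): catching a broad Exception per volume means one failed or mismatched error type cannot abort the thread and leave the remaining volumes undeleted.

```python
import logging

def DeleteVolumes(volume_ids, delete_fn):
    """Best-effort deletion of every volume in the list.

    A failure deleting one volume (or an exception type our
    wrappers don't match) is logged and recorded, but does not
    stop the loop, so the remaining volumes still get deleted.
    """
    failures = []
    for volume_id in volume_ids:
        try:
            delete_fn(volume_id)
        except Exception as e:  # novaclient errors may not match our wrappers
            logging.error('Failed to delete volume %s: %s', volume_id, e)
            failures.append(volume_id)
    return failures
```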

@ctlaltlaltc

@meteorfox 👍 LGTM, thanks. Tested on Liberty.

@kivio (Contributor)

kivio commented Dec 4, 2015

@9723 @meteorfox I've known about this error on OpenStack Juno for a long time, but it is a little problematic. First, the path returned by the nova client is only a suggestion, not a guaranteed path; this is described in the Nova Client documentation. Second, when you attach more than one volume at a time, you can hit a race condition with the /virtio- path. The best approach for me was taking the next /dev/vd(b, c, d, e, ...) on each machine, but even that is not easy. This problem is specific to KVM; I don't know how other hypervisors on OpenStack handle it.

@voellm (Contributor)

voellm commented Dec 4, 2015

Should we cherry pick this into v1.0.0? About ready?


@@ -95,6 +95,13 @@ def Attach(self, vm):
    is_unattached = not(volume.status == "in-use"
                        and volume.attachments)

    for attachment in volume.attachments:
        if self.attach_id == attachment.get('volume_id'):
            self.device = attachment.get('device')
@meteorfox (Contributor Author), commenting on the diff:

@kivio I think using the 'device' property from the attachments list of dict objects we get back from volumes.get() should be OK.

If you notice at line 85, instead of giving a device hint, I just let OpenStack pick the device path, then retrieve the actual device path once the volume is attached. I believe this is reliable and will work for other kinds of hypervisors.

For later releases of OpenStack, we will need to introduce the python-cinderclient library for volume management since these operations have been deprecated out of the nova client.
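The attach-then-retrieve approach described above can be sketched as a polling loop. This is a hypothetical illustration: `get_volume_fn` stands in for the nova client's volumes.get(), and the 'volume_id'/'device' keys follow the attachments dicts shown in the diff.

```python
import time

def WaitForDevicePath(get_volume_fn, volume_id, timeout_secs=60):
    """Poll until the volume is attached, then return its real device.

    Rather than passing a device hint to the attach call, we attach
    without one and poll the volume until Nova reports an attachment
    with a non-empty 'device' field, which is the path the hypervisor
    actually used.
    """
    deadline = time.time() + timeout_secs
    while time.time() < deadline:
        volume = get_volume_fn(volume_id)
        for attachment in getattr(volume, 'attachments', []):
            if attachment.get('volume_id') == volume_id:
                device = attachment.get('device')
                if device:
                    return device
        time.sleep(1)
    raise TimeoutError('volume %s never reported a device path' % volume_id)
```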

@meteorfox (Contributor Author)

@voellm I'd say yes; it's working for both @9723 and me, but I want to wait for @kivio's response since he's the maintainer.

@kivio (Contributor)

kivio commented Dec 7, 2015

@meteorfox ok, sounds good to me. 👍

@meteorfox (Contributor Author)

@kivio I think the code addresses those concerns, and it works on Kilo (tested by me) and Liberty (tested by @9723). I would like to merge this change since it is very low risk and we have confidence that it works, if that's OK with you.

@meteorfox (Contributor Author)

@kivio Sorry, I didn't see your comment, disregard the comment above.

@voellm @cmccoy OK to merge ?

            time.sleep(sleep)
            sleep_count += 1
            if sleep_count == 10:
                sleep = 5
-   except (os_utils.NotFound, os_utils.BadRequest):
+   except Exception:
@cmccoy (Contributor), commenting on the diff:

Would it be worth including some information on this in the logging statement? This could swallow an exception unrelated to the volume not being found...

@meteorfox (Contributor Author)

@cmccoy Good point. I can append the exception message to the log.

@cmccoy (Contributor)

cmccoy commented Dec 7, 2015

@meteorfox: I just added a few questions.

Reduce broadness of exception clause by catching specific exceptions
instead of the broad Exception class.
@meteorfox (Contributor Author)

@cmccoy: I've narrowed the exception scope down to just the errors it expects to handle. I brought back the imports, directly in os_disk.py and os_virtual_machine.py, but I'm lazy loading them. I tested again and actually found a case where the volume failed to create and was throwing a None exception that was being 'swallowed'.
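The lazy-loading pattern described here can be sketched generically. This is an illustration only: the stdlib `json` module stands in for novaclient's exceptions module (which may not be installed), and the function names are hypothetical. Deferring the import inside a function keeps module import light while still letting the except clause name specific exception classes.

```python
def _GetExpectedExceptions():
    """Lazy-load the exception classes we want to catch.

    Importing inside the function defers the dependency until the
    code actually runs, mirroring how novaclient's exceptions can
    be imported without breaking lazy loading. Here `json` is a
    stand-in for `from novaclient import exceptions`.
    """
    import json  # deferred import
    return (json.JSONDecodeError,)

def ParseVolumeInfo(text):
    """Catch only the specific, expected errors, not bare Exception."""
    import json
    try:
        return json.loads(text)
    except _GetExpectedExceptions():
        return None
```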

@cmccoy (Contributor)

cmccoy commented Dec 8, 2015

@meteorfox - Thanks. 👍

@meteorfox (Contributor Author)

Thanks, everyone, for the CRs.

meteorfox added a commit that referenced this pull request Dec 8, 2015
openstack: Add support for using multiple data volumes
@meteorfox meteorfox merged commit 34daa1c into GoogleCloudPlatform:master Dec 8, 2015
@meteorfox meteorfox deleted the openstack-multi-volumes branch December 8, 2015 16:57