New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon/OSDMonitor: osd crush set-device-class #14039

Merged
merged 7 commits into from Apr 10, 2017

Conversation

Projects
None yet
3 participants
@ghost

ghost commented Mar 19, 2017

@ghost ghost added core feature labels Mar 19, 2017

@ghost ghost changed the title from osd crush set-device-class to DNM: osd crush set-device-class Mar 19, 2017

@ghost ghost requested a review from liewegas Mar 19, 2017

@ghost

This comment has been minimized.

ghost commented Mar 19, 2017

@liewegas this is a draft, if it looks like it is going in the right direction I'll add tests & make it work. The general idea is to rebuild the whole class hierarchy when the device class changes. It's not efficient but it's simple and maybe efficiency is an issue here.

@liewegas liewegas changed the title from DNM: osd crush set-device-class to DNM: mon/OSDMonitor: osd crush set-device-class Mar 20, 2017

@liewegas

Looks good to me! I think the next step (and the one that is going to be a bit tricky) is making the OSDs automatically set their device class on startup. For that, I think we need a similar command like set-device-class that only sets it if it isn't already set to something. That way the OSD can set it to a sane default, and the admin can change it, and it will stick that way.

I'm not sure how we should come up with the automagic classes, though. Probably some function in ObjectStore like get_default_device_class(), and the implementation can look at the is_rotational() property on the data and journal partitions, but we'll need a sane naming convention. Maybe

hdd
ssd
hdd+ssd-journal
hdd+ssd-wal+ssd-db

but the + and - part is confusing to parse. Also, should we/can we distinguish between ssd and nvme (sata/sas vs pci)?

@ghost

This comment has been minimized.

ghost commented Mar 20, 2017

I think the next step (and the one that is going to be a bit tricky) is making the OSDs automatically set their device class on startup.

Yes. I chose to implement set-device-class first because we'll need it for what's to come.

I'm not sure how we should come up with the automagic classes, though.

Maybe we could first implement the ceph-disk explicit --class argument suggested in the mailing list thread. And after that implement optional auto-detection to reduce the manual steps. What do you think ?

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 20, 2017

sounds good!

@ghost

This comment has been minimized.

ghost commented Mar 20, 2017

cleanup, added tests and fixing border cases

@ghost ghost changed the title from DNM: mon/OSDMonitor: osd crush set-device-class to mon/OSDMonitor: osd crush set-device-class Mar 23, 2017

@ghost

This comment has been minimized.

ghost commented Mar 23, 2017

@liewegas I think this is ready for a final review

if (class_id_count < 2) {
ldout(cct, 0) << "update_device_class " << name << " id " << id << " is the last device "
<< "with class " << class_name[old_class_id] << ", manually edit the crushmap "
<< "to remove the class entirely" << dendl;

This comment has been minimized.

@liewegas

liewegas Mar 23, 2017

Member

This is the only bit that worries me. If i set a class wrong (e.g. typo) i can't fix it.

Is the concern that you'll end up with an empty device tree? (This seems fine to me.) Or no device tree at all?

I'm forgetting how the tree generation works, but if we generate trees for any device class that is (1) used by an osd or (2) referenced by a rule that would prevent bad things from happening I think?

This comment has been minimized.

@ghost

ghost Mar 23, 2017

This is the only bit that worries me. If i set a class wrong (e.g. typo) i can't fix it.

set-device-class will not create a new class, it will only set the class of a device if it already exists

Is the concern that you'll end up with an empty device tree? (This seems fine to me.) Or no device tree at all?

If this command implies creation / deletion of classes, it will have to handle all the associated border cases. If it can only be used to set the class of a device when the class already exists and when doing so does not obsolete the class entirely, it is much simpler. For instance, removing the last device of a class means the associated class bucket tree is removed permanently and the rules that use it point to a non existent tree and the pool using this rule no longer reference something that exists.

From a sysadmin point of view it does not seem unreasonable to require that some operations such as creating classes, removing classes are done by editing the crushmap. And only operations that are required for daily cluster management are implemented via the cli (such as setting or changing the class of a given device).

I did not think this through though and maybe the cli should allow the full range of operations so that the admin is never required to edit the crushmap.

What do you think ?

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 23, 2017

@ghost

This comment has been minimized.

ghost commented Mar 23, 2017

Ok. Adding ceph osd crush class... to this series makes sense then. If the class can exist independently from the devices, I'm inclined to add "class XXXX" in the crushmap.txt format. So that an empty tree referenced by a rule can be compiled/decompiled in this way.

@liewegas

This comment has been minimized.

Member

liewegas commented Mar 23, 2017

@ghost

This comment has been minimized.

ghost commented Mar 23, 2017

Yes. I'll draft some code so we can discuss this further.

@ghost ghost dismissed liewegas’s stale review Mar 23, 2017

It needs more work after all

@ghost ghost changed the title from mon/OSDMonitor: osd crush set-device-class to DNM: mon/OSDMonitor: osd crush set-device-class Mar 23, 2017

ldachary added some commits Mar 19, 2017

crush: CrushWrapper::trim_roots_with_class removed unused variable
Signed-off-by: Loic Dachary <loic@dachary.org>
crush: implement rebuild_roots_with_classes
- remove all class buckets (even those that are in use)

- rebuild all class buckets (this will restore the buckets that were in
  use with exactly the same id)

- remove the class buckets that are not used

Signed-off-by: Loic Dachary <loic@dachary.org>
crush: CrushWrapper s/remove_unused_root/remove_root/
The unused argument allows removal of class buckets that are in use.

Signed-off-by: Loic Dachary <loic@dachary.org>
@ghost

This comment has been minimized.

ghost commented Apr 3, 2017

@liewegas you're right, there is no harm in having an empty tree. I just removed the check that failed if the device was the last of its class. Together with the ls/create/rm class it should be enough to maintain the classes in all cases.

@ghost ghost changed the title from DNM: mon/OSDMonitor: osd crush set-device-class to mon/OSDMonitor: osd crush set-device-class Apr 3, 2017

@@ -1230,6 +1230,32 @@ int CrushWrapper::remove_rule(int ruleno)
return 0;
}
int CrushWrapper::update_device_class(CephContext *cct, int id, const string& class_name, const string& name)

This comment has been minimized.

@liewegas

liewegas Apr 4, 2017

Member

commit message on this commit is stale

@liewegas

commit message nit aside, looks good!

@liewegas liewegas added the needs-qa label Apr 4, 2017

ldachary added some commits Mar 20, 2017

crush: implement update_device_class
Change the device class of a given device. The class must already exist.

Signed-off-by: Loic Dachary <loic@dachary.org>
mon: osd crush set-device-class
Signed-off-by: Loic Dachary <loic@dachary.org>
crush: implement class_is_in_use/remove_class_name
Signed-off-by: Loic Dachary <loic@dachary.org>
mon: implement osd crush class create/rm/ls
Signed-off-by: Loic Dachary <loic@dachary.org>
@ghost

This comment has been minimized.

ghost commented Apr 5, 2017

@liewegas ooops, updated the commit message and repushed, thanks for catching that.

@yuriw yuriw merged commit 937adde into ceph:master Apr 10, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment