Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crush: document tunables and rule step set_ #13722

Merged
1 commit merged into from Mar 13, 2017

Conversation

Projects
None yet
2 participants
@ghost
Copy link

ghost commented Mar 1, 2017

No description provided.

@ghost ghost added core documentation labels Mar 1, 2017

@ghost ghost requested a review from liewegas Mar 1, 2017

@ghost ghost changed the title crush: document tunables WIP: crush: document tunables Mar 1, 2017

@ghost ghost changed the title WIP: crush: document tunables DNM: crush: document tunables Mar 1, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 1, 2017

@liewegas after exploring all tunables & the rule steps that can be used to set them, I think only the total_retry tunable + the set_choose_tries step should be documented as useful for the user to modify. The others are for backward compatiblity.

If that looks right to you, I'll turn this into a proper documentation change based on my notes in the description above.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 1, 2017

@liewegas I got it wrong, please don't review just yet. chooseleaf_tries & choose_tries are both in use, I got confused by the fact that chooseleaf_tries re-uses chooseleaf_descend_once.

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 1, 2017

Here is the updated explanation for the total_retry tunable + the set_choose_tries step + the set_chooseleaf_tries step:

The CHOOSE_FIRSTN and CHOOSE_INDEP rule step look for buckets of a given
type, randomly selecting them. If they are unlucky and find the same
bucket twice, they will try N+1 times (N being the value of the
choose_total_tries tunable). If there is a previous SET_CHOOSE_TRIES
step in the same rule, it will try O times instead (O being the value of
the argument of the SET_CHOOSE_TRIES step).

Note: the choose_total_tries tunable is the number of retry, not the
number of tries. The number of tries is the number of retry + 1. The
SET_CHOOSE_TRIES rule step sets the number of tries and does not need
the + 1. This confusing difference is inherited from an off-by-one bug
from years ago.

The CHOOSELEAF_FIRSTN and CHOOSELEAF_INDEP rule step do the same as
CHOOSE_FIRSTN and CHOOSE_INDEP but also recursively explore each bucket
found, looking for a single device. The same device may be found in two
different buckets because the crush map is not a strict hierarchy, it is
a DAG. When such a collision happens, they will try again. The number of
times they try to find a non colliding device is:

  • If FIRSTN and there is no previous SET_CHOOSELEAF_TRIES rule step: try
    N + 1 times (N being the value of the choose_total_tries tunable)

  • If FIRSTN and there is a previous SET_CHOOSELEAF_TRIES rule step: try
    P times (P being the value of the argument of the SET_CHOOSELEAF_TRIES
    rule step)

  • If INDEP and there is no previous SET_CHOOSELEAF_TRIES rule step: try
    1 time.

  • If INDEP and there is a previous SET_CHOOSELEAF_TRIES rule step: try
    P times (P being the value of the argument of the SET_CHOOSELEAF_TRIES
    rule step)

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Mar 1, 2017

Weird.. the INDEP not respecting set_chooseleaf_tries must have been a bug?

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 1, 2017

Weird.. the INDEP not respecting set_chooseleaf_tries must have been a bug?

it respects set_chooseleaf_tries but defaults to 1 instead of choose_total_tries

it's definitely an inconsistency that could lead to unexpected bad mappings. I guess it never led to any complaint because collisions can only happen if the same device shows in two different buckets.

@ghost ghost changed the title DNM: crush: document tunables crush: document tunables Mar 2, 2017

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 2, 2017

@liewegas it is ready for your review when you have time :-)

@ghost ghost changed the title crush: document tunables crush: document tunables and rule step set_ Mar 2, 2017

* a given type, randomly selecting them. If they are unlucky and
* find the same bucket twice, they will try N+1 times (N being the
* value of the choose_total_tries tunable). If there is a previous
* SET_CHOOSE_TRIES step in the same rule, it will try O times

This comment has been minimized.

Copy link
@liewegas

liewegas Mar 2, 2017

Member

choose a letter besides O so that it doesn't look like 0 ?

This comment has been minimized.

Copy link
@ghost

ghost Mar 2, 2017

Author

done

*! reject retry outer descent. Note that this does *not*
*! apply to a collision: in that case we will retry as we used
*! to. */
/* obsoleted in firefly */
__u32 chooseleaf_descend_once;

This comment has been minimized.

Copy link
@liewegas

liewegas Mar 2, 2017

Member

I don't understand what the comment means. The tunables are all here to preserve legacy behavior, so they don't really become obsolete. And we set it to true now..

This comment has been minimized.

Copy link
@ghost

ghost Mar 2, 2017

Author

It was a leftover of a misunderstanding on my part, thanks for catching it.

* this post: "chooseleaf may cause some unnecessary pg
* migrations" in October 2015
* https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg26075.html
* It should always bet set to 1 except for backward compatibility.

This comment has been minimized.

Copy link
@liewegas

liewegas Mar 2, 2017

Member

s/bet/be/

This comment has been minimized.

Copy link
@ghost

ghost Mar 2, 2017

Author

done

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 2, 2017

updated & repushed

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 11, 2017

@liewegas is there anything else you'd like me to change ?

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Mar 13, 2017

the s/bet/be/ typo is still there, but otherwise it's good!

@ghost

This comment has been minimized.

Copy link
Author

ghost commented Mar 13, 2017

the s/bet/be/ typo is still there, but otherwise it's good!

Fixed. I copy pasted the typo 5 times but now I hopefully got all of them :-P

crush: document tunables and rule step set_
Signed-off-by: Loic Dachary <loic@dachary.org>

@ghost ghost merged commit b1a2f9f into ceph:master Mar 13, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

This issue was closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.