Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crush: detect and (usually) fix ruleset != rule id #13683

Merged
merged 14 commits into from Jun 21, 2017

Conversation

Projects
None yet
4 participants
@liewegas
Copy link
Member

liewegas commented Feb 28, 2017

Transition all CRUSH maps to stop using ruleset. Someday we'll be able to
drop it from CRUSH.

@liewegas liewegas added the common label Feb 28, 2017

@liewegas liewegas requested a review from Mar 1, 2017

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from b44ac96 to 8b5d849 Mar 1, 2017

@ghost ghost added the cleanup label Mar 1, 2017

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 1, 2017

It is intended to be a breaking change ? i.e. existing commands using ruleset must be converted to rule + the output that previous had ruleset as a dictionary key now have rule instead ?

@liewegas

This comment has been minimized.

Copy link
Member Author

liewegas commented Mar 2, 2017

Yes, sort of. The pool set property has allowed crush_rule instead of crush_ruleset for a long time; we'r ejust finally dropping the crush_ruleset variation.

The other changes should only change the output json, replacing crush_ruleset with crush_rule.

It would be nice to just tell consuming apps to change for luminous, but if we really need to we could keep them both for another cycle.. i just worry that it won't really help, since there's no big "this interface is deprecated!" warning we can show a user.

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 2, 2017

Ok then :-) Reviewing with that in mind.

@ghost
Copy link

ghost left a comment

Looking carefully at 5b51acf it looks good. The rest being renaming only, it does not seem to be a source of errors that compilation / tests cannot catch.

I suppose the qa directory has a few tests / tasks that still have "ruleset" in them ( workunits/cephtool/test.sh etc.).

The CrushWrapper functions deserve unit testing.

It otherwise looks good and I can't imagine a use case where it would cause trouble.

*ss << "No suitable CRUSH ruleset exists, check "
if (rule_name == "") {
//Use default rule
*crush_rule = osdmap.crush->get_osd_pool_default_crush_replicated_ruleset(g_ceph_context);

This comment has been minimized.

Copy link
@ghost

ghost Mar 2, 2017

s/ruleset/rule/

This comment has been minimized.

Copy link
@liewegas

liewegas Apr 26, 2017

Author Member

My thinking is to leave most of the crush ruleset stuff alone until it's been long enough that we can remove it entirely. Ceph can be restricted to a 'rule' concept and reject crush maps with rulesets containing more than one rule.

This comment has been minimized.

Copy link
@ghost

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch 3 times, most recently from bd1df9a to 64b1a02 Apr 26, 2017

@liewegas liewegas requested a review from joscollin May 30, 2017

@jdurgin

This comment has been minimized.

Copy link
Member

jdurgin commented May 31, 2017

@joscollin
Copy link
Member

joscollin left a comment

The following directories needs cleanup:

  1. doc/
  2. src/pybind/
  3. src/test/mon/osd-crush.sh
  4. src/tools/crushtool.cc: if(OSDMap::build_simple_crush_rulesets(g_ceph_context, crush, root, &cerr))
By default it is -1, which means the mon will pick the first type
replicated rule in the CRUSH map for replicated pools. Erasure
coded pools have rules that are automatically created for them if they are
not specified at pool creation time.

This comment has been minimized.

Copy link
@joscollin

joscollin Jun 7, 2017

Member

This file ends here. But in the ceph:master, it continues to the section 12.0.2. I don't know why it is not shown as a difference in github. I think this problem comes when there are multiple commits in a PR.

bool CrushWrapper::has_legacy_rulesets() const
{
for (unsigned i=0; i<crush->max_rules; i++) {
crush_rule *r = crush->rules[i];

This comment has been minimized.

Copy link
@joscollin

joscollin Jun 7, 2017

Member

Could you please avoid single character variable names in new additions ? It seems unnecessary comment, but useful when searching the occurrences. Please see: #15279

@@ -2979,7 +2979,7 @@ int OSDMap::build_simple(CephContext *cct, epoch_t e, uuid_d &fsid,
pools[pool].set_flag(pg_pool_t::FLAG_NOSIZECHANGE);
pools[pool].size = cct->_conf->osd_pool_default_size;
pools[pool].min_size = cct->_conf->get_osd_pool_default_min_size();
pools[pool].crush_ruleset = default_replicated_ruleset;
pools[pool].crush_rule = default_replicated_ruleset;

This comment has been minimized.

Copy link
@joscollin

joscollin Jun 7, 2017

Member

ruleset or rule ?

int const default_replicated_ruleset = crush->get_osd_pool_default_crush_replicated_ruleset(cct);

Here too.

@@ -3135,14 +3135,10 @@ int OSDMap::build_simple_crush_rulesets(CephContext *cct,
const string& root,
ostream *ss)
{
int crush_ruleset =
crush._get_osd_pool_default_crush_replicated_ruleset(cct, true);
int crush_ruleset = crush.get_osd_pool_default_crush_replicated_ruleset(cct);

This comment has been minimized.

Copy link
@joscollin

joscollin Jun 7, 2017

Member

ruleset or rule ?

@ghost

ghost approved these changes Jun 7, 2017

@gregsfortytwo

This comment has been minimized.

Copy link
Member

gregsfortytwo commented Jun 14, 2017

http://tracker.ceph.com/issues/17138 Should be closed when this merges, please. :)

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from 64b1a02 to cc003bd Jun 15, 2017

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from cc003bd to 637e92e Jun 15, 2017

@liewegas liewegas added needs-qa and removed wip-sage-testing labels Jun 15, 2017

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from 637e92e to e2bb05e Jun 15, 2017

@liewegas

This comment has been minimized.

Copy link
Member Author

liewegas commented Jun 16, 2017

2017-06-16T04:26:33.600 INFO:tasks.ceph.mgr.x.smithi022.stderr:Error on request:
2017-06-16T04:26:33.601 INFO:tasks.ceph.mgr.x.smithi022.stderr:Traceback (most recent call last):
2017-06-16T04:26:33.601 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 177, in run_wsgi
2017-06-16T04:26:33.601 INFO:tasks.ceph.mgr.x.smithi022.stderr:    execute(self.server.app)
2017-06-16T04:26:33.601 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 165, in execute
2017-06-16T04:26:33.601 INFO:tasks.ceph.mgr.x.smithi022.stderr:    application_iter = app(environ, start_response)
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib/python2.7/site-packages/pecan/middleware/recursive.py", line 56, in __call__
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:    return self.application(environ, start_response)
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 570, in __call__
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:    self.handle_request(req, resp)
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib/python2.7/site-packages/pecan/core.py", line 508, in handle_request
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:    result = controller(*args, **kwargs)
2017-06-16T04:26:33.602 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib64/ceph/mgr/restful/decorators.py", line 33, in decorated
2017-06-16T04:26:33.603 INFO:tasks.ceph.mgr.x.smithi022.stderr:    return f(*args, **kwargs)
2017-06-16T04:26:33.603 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib64/ceph/mgr/restful/api/osd.py", line 130, in get
2017-06-16T04:26:33.603 INFO:tasks.ceph.mgr.x.smithi022.stderr:    return module.instance.get_osds(pool_id)
2017-06-16T04:26:33.603 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib64/ceph/mgr/restful/module.py", line 451, in get_osds
2017-06-16T04:26:33.603 INFO:tasks.ceph.mgr.x.smithi022.stderr:    pools_map = self.get_osd_pools()
2017-06-16T04:26:33.604 INFO:tasks.ceph.mgr.x.smithi022.stderr:  File "/usr/lib64/ceph/mgr/restful/module.py", line 422, in get_osd_pools
2017-06-16T04:26:33.604 INFO:tasks.ceph.mgr.x.smithi022.stderr:    for rule in [r for r in crush_rules if r['rule'] == pool['crush_rule']]:
2017-06-16T04:26:33.604 INFO:tasks.ceph.mgr.x.smithi022.stderr:KeyError: 'rule'

/a/sage-2017-06-16_02:34:35-rados-wip-sage-testing2-distro-basic-smithi/1292990

@liewegas

This comment has been minimized.

Copy link
Member Author

liewegas commented Jun 16, 2017

2017-06-16T05:48:09.108 INFO:tasks.workunit.client.0.smithi059.stdout:GET osd/pool/get.json?pool=rbd&var=crush_rule: json 200
2017-06-16T05:48:09.108 INFO:tasks.workunit.client.0.smithi059.stderr:Traceback (most recent call last):
2017-06-16T05:48:09.108 INFO:tasks.workunit.client.0.smithi059.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/rest/test.py", line 422, in 
2017-06-16T05:48:09.108 INFO:tasks.workunit.client.0.smithi059.stderr:    assert(r.myjson['output']['crush_rule'] == 0)
2017-06-16T05:48:09.108 INFO:tasks.workunit.client.0.smithi059.stderr:AssertionError

/a/sage-2017-06-16_02:34:35-rados-wip-sage-testing2-distro-basic-smithi/1293239

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from e2bb05e to 6f57a01 Jun 16, 2017

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from 6f57a01 to 9912e00 Jun 17, 2017

liewegas added some commits Feb 28, 2017

crush: add rule/set checks, conversion helpers
Add chekcs to see whether the CRUSH map has legacy ruleset != rule id
or (worse yet) multiple rules sharing the same ruleset.

Add a method to convert the former case, and error out on the latter.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: issue health warning if there are multirule rulesets
The user will need to manually fix the CRUSH map if this happens.

Signed-off-by: Sage Weil <sage@redhat.com>
crush/CrushWrapper: optimize find_rule when rules are uniform
On finalize, set a flag indicating whether our rules are uniform.  If so,
find_rule() is trivial--it just verifies the rule exists.

Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: rename 'osd pool create' argument field name
Signed-off-by: Sage Weil <sage@redhat.com>
osd/osd_types: pg_pool_t: rename crush_ruleset -> crush_rule
Signed-off-by: Sage Weil <sage@redhat.com>
mon: remove 'crush_ruleset' property
The 'crush_rule' name has been present since before hammer; users should
use that instead

Signed-off-by: Sage Weil <sage@redhat.com>
mon: scrub 'ruleset' from user-visible output
Signed-off-by: Sage Weil <sage@redhat.com>
mon/OSDMonitor: rename functions and variables, ruleset -> rule
Signed-off-by: Sage Weil <sage@redhat.com>
crush: simplify osd_pool_default_crush_rule config
Make an incompat change here with a release note since
this only affects pool creation, a rare event, and folks
who have customized their configs (also rare).

Keep it simple: a config sets the default rule, or else we pick
the first TYPE_REPLICATED pool in the crush map.

Signed-off-by: Sage Weil <sage@redhat.com>
osd/OSDMap: some cosmetic cleanup (ruleset -> rule)
Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from 9912e00 to c7534cb Jun 19, 2017

osd: replicated_ruleset -> replicated_rule
Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas force-pushed the liewegas:wip-crush-rulesets branch from c7534cb to abad477 Jun 20, 2017

addressed comments

@liewegas liewegas merged commit 1e7cd35 into ceph:master Jun 21, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
arm64 make check arm64 make check succeeded
Details
make check make check succeeded
Details

@liewegas liewegas deleted the liewegas:wip-crush-rulesets branch Jun 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.