mimic: osd: Better error message when OSD count is less than osd_pool_default_size #30180

Merged
merged 5 commits into from Oct 3, 2019

Conversation


smithfarm commented Sep 5, 2019

NOTE: I intentionally omitted 0483c1c because there is no qa/tasks/ceph.conf.template in mimic. As noted below, in mimic the template file in the teuthology repo is used. Therefore, this PR has a companion PR in teuthology.

backport tracker: https://tracker.ceph.com/issues/40083
companion teuthology PR: ceph/teuthology#1308


backport of #27806
parent tracker: https://tracker.ceph.com/issues/38617

this backport was staged using https://github.com/ceph/ceph/blob/master/src/script/ceph-backport.sh

@smithfarm smithfarm added this to the mimic milestone Sep 5, 2019
@smithfarm smithfarm added the core label Sep 5, 2019
@smithfarm smithfarm requested review from neha-ojha, tchaikov and liewegas Sep 5, 2019

neha-ojha commented Sep 5, 2019

@smithfarm 1ab352d added qa/tasks/ceph.conf.template; before that, we were using teuthology/ceph.conf.template in teuthology.git. I think we can just add it there.

smithfarm commented Sep 7, 2019

Also, build failure:

/home/jenkins-build/build/workspace/ceph-pull-requests/src/mon/PGMap.cc: In member function 'void PGMap::get_health_checks(CephContext*, const OSDMap&, health_check_map_t*) const':
/home/jenkins-build/build/workspace/ceph-pull-requests/src/mon/PGMap.cc:2573:39: error: request for member 'get_val' in 'cct->CephContext::_conf', which is of pointer type 'md_config_t*' (maybe you meant to use '->' ?)
   auto warn_too_few_osds = cct->_conf.get_val<bool>("mon_warn_on_too_few_osds");
                                       ^~~~~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/mon/PGMap.cc:2573:47: error: expected primary-expression before 'bool'
   auto warn_too_few_osds = cct->_conf.get_val<bool>("mon_warn_on_too_few_osds");
                                               ^~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/mon/PGMap.cc:2574:43: error: request for member 'get_val' in 'cct->CephContext::_conf', which is of pointer type 'md_config_t*' (maybe you meant to use '->' ?)
   auto osd_pool_default_size = cct->_conf.get_val<uint64_t>("osd_pool_default_size");
                                           ^~~~~~~
/home/jenkins-build/build/workspace/ceph-pull-requests/src/mon/PGMap.cc:2574:59: error: expected primary-expression before '>' token
   auto osd_pool_default_size = cct->_conf.get_val<uint64_t>("osd_pool_default_size");
                                                           ^
src/CMakeFiles/common-objs.dir/build.make:713: recipe for target 'src/CMakeFiles/common-objs.dir/mon/PGMap.cc.o' failed
make[3]: *** [src/CMakeFiles/common-objs.dir/mon/PGMap.cc.o] Error 1
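
The errors above come from using `.` on `_conf`, which the compiler reports as a pointer (`md_config_t*`) on this branch, so the call needs `->`. Below is a minimal sketch of that difference. `MockConfig`, `MockContext`, and `too_few_osds` are hypothetical stand-ins and not the real Ceph classes, and the comparison against the OSD count is an assumption based on the PR title rather than a copy of `PGMap::get_health_checks`.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for md_config_t: every option is stored as an
// integer and converted on read. The real class is far richer.
struct MockConfig {
  std::map<std::string, uint64_t> values;

  template <typename T>
  T get_val(const std::string& key) const {
    return static_cast<T>(values.at(key));  // throws if the key is missing
  }
};

// Hypothetical stand-in for CephContext: _conf is a raw pointer, which is
// what the compiler error above reports for this branch.
struct MockContext {
  MockConfig* _conf;
};

// Assumed check: warn when the option is enabled and there are fewer OSDs
// than osd_pool_default_size.
bool too_few_osds(const MockContext* cct, uint64_t num_osds) {
  // cct->_conf.get_val<bool>(...) would not compile here, because _conf
  // is a pointer; the member has to be reached with '->'.
  auto warn_too_few_osds =
      cct->_conf->get_val<bool>("mon_warn_on_too_few_osds");
  auto osd_pool_default_size =
      cct->_conf->get_val<uint64_t>("osd_pool_default_size");
  return warn_too_few_osds && num_osds < osd_pool_default_size;
}
```

This matches the conflict note recorded in the cherry-picked commits ("cct->_conf->get_val in mimic").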
jiahuizeng and others added 3 commits Apr 26, 2019
…t_size

Fixes: http://tracker.ceph.com/issues/38617

Signed-off-by: zjh <jhzeng93@foxmail.com>
(cherry picked from commit 94237d3)

Conflicts:
	doc/rados/operations/health-checks.rst
	- trivial
	src/mon/PGMap.cc
	- cct->_conf->get_val in mimic
Signed-off-by: zjh <jhzeng93@foxmail.com>
(cherry picked from commit e62cfce)
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3b74fbc)

Conflicts:
	src/mon/PGMap.cc
	- cct->_conf->get_val in mimic
@smithfarm smithfarm force-pushed the smithfarm:wip-40083-mimic branch from dcf571e to 72a66e7 Sep 7, 2019

smithfarm commented Sep 7, 2019

@neha-ojha I opened ceph/teuthology#1308 to go with this.

liewegas and others added 2 commits Feb 5, 2019
Stopping the osd daemon won't reliably get you HEALTH_WARN or ERR; you have
to make sure it is also marked down.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit dcdca44)
address the regression introduced by e62cfce

In e62cfce, we wanted to test the newly introduced TOO_FEW_OSDS
warning, so we increased the number of OSDs to the size of the pool:
if the number of OSDs is less than the pool size, the monitor sends a
warning message.

But we need to bring all OSDs back if we are expecting a healthy
cluster. In this change, all OSDs are resurrected before
`wait_for_health_ok`.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit cdba0f1)

smithfarm commented Sep 10, 2019

@neha-ojha @tchaikov Please take another look - I cherry-picked dcdca44 and cdba0f1

smithfarm commented Sep 10, 2019

see also luminous backport PR #30298

yuriw commented Sep 30, 2019

@yuriw yuriw merged commit 0e01e0d into ceph:mimic Oct 3, 2019
4 checks passed
Docs: build check: OK - docs built
Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
make check: make check succeeded
@smithfarm smithfarm deleted the smithfarm:wip-40083-mimic branch Oct 4, 2019