test: Thrasher: update pgp_num of all expanded pools if not yet #13367

Merged — 4 commits merged into ceph:master from wip-qa-jewel-x-singleton on Feb 13, 2017

Conversation

tchaikov (Contributor):

…fore waiting for healthy

otherwise, "ceph health" complains with:

all OSDs are running luminous or later but the 'require_luminous_osds'
osdmap flag is not set

"ceph.restart" task will timeout and fail at seeing this warning.

So we need to set the osdmap flag after upgrading all OSDs, and call "ceph.restart" again to check whether the cluster is healthy.

Signed-off-by: Kefu Chai kchai@redhat.com
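
For context, the osdmap flag named in that warning can be set with a single monitor command once the last OSD has been upgraded. A minimal sketch of that step, assuming a CephManager-style helper called `raw_cluster_cmd` that wraps the `ceph` CLI (the helper name and placement are assumptions, not necessarily what this commit did):

```python
# Minimal sketch (assumption): set the osdmap flag once every OSD runs
# luminous, so "ceph health" stops warning about 'require_luminous_osds'.
# `manager.raw_cluster_cmd` is assumed to run "ceph <args>" against a monitor.
def set_require_luminous(manager):
    # equivalent to running: ceph osd set require_luminous_osds
    manager.raw_cluster_cmd('osd', 'set', 'require_luminous_osds')
    # after this, the "ceph.restart" wait-for-healthy check can run again
```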

tchaikov (Contributor, Author) commented on Feb 11, 2017:
still has

2017-02-11T12:24:57.507 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172; mon.a has mon_osd_down_out_interval set to 0

@tchaikov tchaikov force-pushed the wip-qa-jewel-x-singleton branch 4 times, most recently from 98d6f20 to b14149d on February 12, 2017 13:30
liewegas (Member):
#13378 fixes the first part using the releases/luminous.yaml convention. not sure about the other patches here?

tchaikov (Contributor, Author) commented on Feb 12, 2017:

@liewegas i will drop the first commit. the other commits address the following warning:

2017-02-11T12:24:57.507 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172; mon.a has mon_osd_down_out_interval set to 0

please see the commit message for more info.

https://github.com/ceph/ceph/pull/13378/files#diff-e1cc51eb4bbd70baf5f2e815a28cdcebR9 addresses the

mon.a has mon_osd_down_out_interval set to 0

warning, while the last three commits do this in another way.

either way, we still need 75fa968, which handles

pool cephfs_data pg_num 182 > pgp_num 172;

@tchaikov tchaikov changed the title qa/suites/rados/upgrade/jewel-x-singleton: "require_luminous_osds" be… test: Thrasher: update pgp_num of all expanded pools if not yet Feb 12, 2017
otherwise wait_until_healthy will fail after a timeout upon seeing a warning like:

HEALTH_WARN pool cephfs_data pg_num 182 > pgp_num 172

Signed-off-by: Kefu Chai <kchai@redhat.com>
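
To make the fix concrete: the Thrasher change raises pgp_num back up to pg_num for any pool that was expanded but not yet re-split, before waiting for the cluster to become healthy. Below is a minimal sketch of that idea; the helper names (`list_pools`, `get_pool_property`, `set_pool_property`) are assumptions standing in for the CephManager API, not the exact code that was merged:

```python
# Minimal sketch (helper names are assumptions): before wait_until_healthy,
# bring pgp_num in line with pg_num for every expanded pool so "ceph health"
# stops reporting "pool <name> pg_num X > pgp_num Y".
def fix_pgp_num(manager):
    for pool in manager.list_pools():
        pg_num = int(manager.get_pool_property(pool, 'pg_num'))
        pgp_num = int(manager.get_pool_property(pool, 'pgp_num'))
        if pgp_num < pg_num:
            # the pool was expanded (pg_num raised) without re-splitting;
            # raising pgp_num clears the HEALTH_WARN shown above
            manager.set_pool_property(pool, 'pgp_num', pg_num)
```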
liewegas (Member) left a review comment:

lgtm. fwiw i avoided the down_out_interval warning by just disabling that warning via a config option in luminous.yaml. this will avoid the problem in general, though!

@tchaikov tchaikov merged commit 148d488 into ceph:master Feb 13, 2017
@tchaikov tchaikov deleted the wip-qa-jewel-x-singleton branch February 13, 2017 06:54