mon/OSDMonitor: fix improper input/testing range of crush smoke testing #17179
lgtm modulo the nit.
src/mon/OSDMonitor.cc
tester.set_max_x(50);
auto start = ceph_clock_now();
might want to use ceph::coarse_mono_clock::now() for this purpose.
Repushed. Thanks, kefu.
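For readers unfamiliar with the suggestion: a monotonic clock suits duration measurements because it never jumps backwards when the wall clock is adjusted. The sketch below illustrates the idea with `std::chrono::steady_clock` as a stand-in; it is an assumption for illustration, not the Ceph clock API, and `timed_seconds` is a hypothetical helper.

```cpp
#include <chrono>

// Hypothetical helper: returns how long `fn` takes to run, in seconds.
// std::chrono::steady_clock stands in for ceph::coarse_mono_clock here
// (an assumption for illustration); both are monotonic clocks.
template <typename Fn>
double timed_seconds(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}
```

Wrapping the smoke test call in such a helper would yield the duration value that the debug log line prints.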
this might help with the smoke test timeout issue http://tracker.ceph.com/issues/20909. so marking it "bug-fix" and "needs-backport".
337884e to 5f384b6
if (r < 0) {
  dout(10) << " tester.test_with_fork returns " << r
           << ": " << ess.str() << dendl;
  ss << "crush smoke test failed with " << r << ": " << ess.str();
  err = r;
  goto reply;
}
dout(10) << " crush test result " << ess.str() << dendl;
dout(10) << __func__ << " crush somke test duration: "
can we dump ess.str() like before?
It does not output anything if the test succeeds, but it does no harm to keep it.
Updated
okay, cool.
9c11f57 to e128a1e
fwiw this failed in luminous backport test:
/a/sage-2017-08-24_17:38:40-rados-wip-sage-testing2-luminous-20170824a-distro-basic-smithi/1560312
2017-08-24T19:29:45.785 INFO:tasks.workunit.client.0.smithi017.stderr:+ ceph osd pool create fooo 123 123 erasure default
2017-08-24T19:29:45.787 INFO:tasks.workunit.client.0.smithi017.stderr:2017-08-24 19:29:45.073532 7f66b4337700 -1 WARNING: all dangerous and experimental features are enabled.
2017-08-24T19:29:45.807 INFO:tasks.workunit.client.0.smithi017.stderr:2017-08-24 19:29:45.143495 7f66b4337700 -1 WARNING: all dangerous and experimental features are enabled.
2017-08-24T19:29:50.489 INFO:tasks.ceph.mon.a.smithi017.stderr:: timed out (5 sec)
2017-08-24T19:29:53.568 INFO:tasks.workunit.client.0.smithi017.stderr:Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
CrushTester::test() resets the testing range to [0, 1023] whenever
min_x or max_x is negative, and the constructor of CrushTester
always defaults min_x and max_x to -1.
Thus, to set the test range correctly, you have to specify both min_x and max_x.
Local testing shows this patch reduces the time consumed by the crush
smoke test to approximately 1/20 of the time taken without it.
For example:
crush somke test duration: 0.668354 seconds ->
crush somke test duration: 0.012592 seconds
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
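
The range-reset behaviour described in the commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the real CrushTester code: `RangeSketch` and `effective_range` are hypothetical names, and only the -1 defaults and the [0, 1023] fallback come from the description above.

```cpp
#include <utility>

// Hypothetical stand-in for the range handling described in the commit
// message; this is NOT the real CrushTester class.
struct RangeSketch {
  int min_x = -1;  // both bounds default to -1, as the constructor is said to do
  int max_x = -1;

  void set_min_x(int x) { min_x = x; }
  void set_max_x(int x) { max_x = x; }

  // Returns the inclusive range a test run would actually cover.
  std::pair<int, int> effective_range() const {
    if (min_x < 0 || max_x < 0) {
      // Either bound left unset: fall back to the full default range.
      return {0, 1023};
    }
    return {min_x, max_x};
  }
};
```

In this sketch, calling only `set_max_x(50)` still yields the range [0, 1023], because `min_x` remains -1. Setting both bounds, for example `set_min_x(0)` (the lower bound of 0 is an assumption here) and `set_max_x(50)`, yields [0, 50], which is 51 inputs instead of 1024.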