mon/OSDMonitor: Make the pg_num check more accurate #39062
@tchaikov Excuse me, here is my new PR.
jenkins test make check
@tchaikov Thank you for your help.
jenkins test make check
Adding the DNM label, as the test failure might be related.
jenkins test make check
jenkins test make check
jenkins test make check
2021-03-09T08:44:40.689 INFO:tasks.ceph.mon.a.smithi025.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-1742-g89fb622b/rpm/el8/BUILD/ceph-17.0.0-1742-g89fb622b/src/osd/OSDMapMapping.h: In function 'mempool::osdmap_mapping::vector<pg_t>& OSDMapMapping::get_osd_acting_pgs(unsigned int)' thread 7f23d5b69700 time 2021-03-09T08:44:40.688560+0000
2021-03-09T08:44:40.690 INFO:tasks.ceph.mon.a.smithi025.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-1742-g89fb622b/rpm/el8/BUILD/ceph-17.0.0-1742-g89fb622b/src/osd/OSDMapMapping.h: 325: FAILED ceph_assert(osd < acting_rmap.size())
2021-03-09T08:44:40.691 INFO:tasks.ceph.mon.a.smithi025.stderr: ceph version 17.0.0-1742-g89fb622b (89fb622b10ca2e58fe0c913a8956377a886e2ab4) quincy (dev)
2021-03-09T08:44:40.691 INFO:tasks.ceph.mon.a.smithi025.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f23e34db1d8]
2021-03-09T08:44:40.692 INFO:tasks.ceph.mon.a.smithi025.stderr: 2: /usr/lib64/ceph/libceph-common.so.2(+0x2763f2) [0x7f23e34db3f2]
2021-03-09T08:44:40.692 INFO:tasks.ceph.mon.a.smithi025.stderr: 3: (OSDMonitor::check_pg_num(long, int, int, int, std::ostream*)+0x4fc) [0x559346ba097c]
2021-03-09T08:44:40.692 INFO:tasks.ceph.mon.a.smithi025.stderr: 4: (OSDMonitor::prepare_new_pool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int, unsigned int, unsigned long, unsigned long, float, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned long, OSDMonitor::FastReadType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream*)+0x6cd) [0x559346bdfced]
2021-03-09T08:44:40.693 INFO:tasks.ceph.mon.a.smithi025.stderr: 5: (OSDMonitor::prepare_command_impl(boost::intrusive_ptr<MonOpRequest>, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&)+0x1845a) [0x559346c07b5a]
2021-03-09T08:44:40.693 INFO:tasks.ceph.mon.a.smithi025.stderr: 6: (OSDMonitor::prepare_command(boost::intrusive_ptr<MonOpRequest>)+0xf4) [0x559346c13054]
2021-03-09T08:44:40.694 INFO:tasks.ceph.mon.a.smithi025.stderr: 7: (OSDMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)+0x373) [0x559346c172c3]
2021-03-09T08:44:40.694 INFO:tasks.ceph.mon.a.smithi025.stderr: 8: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0xa6d) [0x559346b98c2d]
2021-03-09T08:44:40.695 INFO:tasks.ceph.mon.a.smithi025.stderr: 9: (Monitor::handle_command(boost::intrusive_ptr<MonOpRequest>)+0x2794) [0x559346a81c74]
2021-03-09T08:44:40.695 INFO:tasks.ceph.mon.a.smithi025.stderr: 10: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x7f9) [0x559346a864d9]
2021-03-09T08:44:40.696 INFO:tasks.ceph.mon.a.smithi025.stderr: 11: (Monitor::_ms_dispatch(Message*)+0x5f6) [0x559346a87766]
2021-03-09T08:44:40.696 INFO:tasks.ceph.mon.a.smithi025.stderr: 12: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x5c) [0x559346ab5b0c]
2021-03-09T08:44:40.696 INFO:tasks.ceph.mon.a.smithi025.stderr: 13: (DispatchQueue::entry()+0x126a) [0x7f23e37150ea]
2021-03-09T08:44:40.697 INFO:tasks.ceph.mon.a.smithi025.stderr: 14: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f23e37c2ae1]
2021-03-09T08:44:40.697 INFO:tasks.ceph.mon.a.smithi025.stderr: 15: (Thread::_entry_func(void*)+0xd) [0x7f23e35c241d]
2021-03-09T08:44:40.697 INFO:tasks.ceph.mon.a.smithi025.stderr: 16: /lib64/libpthread.so.0(+0x814a) [0x7f23e0fc114a]
2021-03-09T08:44:40.698 INFO:tasks.ceph.mon.a.smithi025.stderr: 17: clone()
45af22d to ebe3d59
@tchaikov All tests have passed.
@tchaikov Excuse me, could you please review this request?
jenkins test make check
2021-05-01T17:27:24.670 INFO:tasks.workunit.client.0.smithi042.stderr:Error EINVAL: pool size must be between 1 and 10
2021-05-01T17:27:24.672 INFO:tasks.workunit.client.0.smithi042.stderr:+ return 0
2021-05-01T17:27:24.672 INFO:tasks.workunit.client.0.smithi042.stderr:+ ceph osd pool set foo size 3
2021-05-01T17:27:24.941 INFO:tasks.workunit.client.0.smithi042.stderr:Error ERANGE: pool id 52 pg_num 123 size 3 would mean 18446744073709551587 total pgs, which exceeds max 30000 (mon_max_pg_per_osd 10000 * num_in_osds 3)
2021-05-01T17:27:24.942 DEBUG:teuthology.orchestra.run:got remote process result: 34
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.
@fyzard1991 Hi Jerry, did you manage to fix the test failure?
@tchaikov Yes. I think this test failed because the condition used for the out_osd statistics was written incorrectly: some of the PGs were not counted, which caused the value of projected to fall below its lower limit.
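For reference, the total reported in the ERANGE message (18446744073709551587) equals 2^64 - 29, which is what a value of -29 looks like when read as an unsigned 64-bit integer. The following is only a minimal sketch of that arithmetic, not the code in this PR; the function and parameter names are hypothetical, and the actual types used in check_pg_num are not assumed here.

```cpp
#include <cstdint>

// Hypothetical reduction of the arithmetic; these names do not come from Ceph.

// Subtracts in unsigned arithmetic. When `removed` is larger than `current`,
// the result wraps around modulo 2^64 instead of going negative.
uint64_t projected_unsafe(uint64_t current, uint64_t removed) {
  return current - removed;
}

// Clamps at zero so an undercounted `current` cannot wrap around.
uint64_t projected_clamped(uint64_t current, uint64_t removed) {
  return current > removed ? current - removed : 0;
}
```

With these helpers, projected_unsafe(0, 29) yields the same huge number as in the log, while projected_clamped(0, 29) yields 0. Clamping only hides the symptom; the undercount in the statistics still has to be fixed at its source.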
@fyzard1991 Could you remove the merge commit from the PR? I will try to review your change this week.
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.
In the check_pg_num function, find the OSDs selected by the current pool's CRUSH rule and calculate whether the average pg_num on these OSDs would exceed the value of 'mon_max_pg_per_osd'. Make the pg_num check more accurate by counting all the PGs on the OSDs used by the new pool.
Fixes: https://tracker.ceph.com/issues/47062
Signed-off-by: Jerry Luo <luojierui@chinatelecom.cn>
@tchaikov Thank you for your help.
In the check_pg_num function, find the OSDs selected by the current pool's CRUSH rule and calculate whether the average pg_num on these OSDs would exceed the value of 'mon_max_pg_per_osd'. Make the pg_num check more accurate by counting all the PGs on the OSDs used by the new pool.
Fixes: https://tracker.ceph.com/issues/47062
Signed-off-by: Jerry Luo <luojierui@chinatelecom.cn>
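
The check described above can be illustrated with a small standalone sketch. This is not the code from this PR: the helper name, its parameters, and the plain std::map/std::set inputs are assumptions made for illustration, standing in for whatever OSDMonitor actually reads from the OSD map and the CRUSH rule.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical helper, not the Ceph implementation.
//   pgs_per_osd      PG replicas currently mapped to each OSD, across all pools
//   rule_osds        OSDs that the new pool's CRUSH rule can select
//   new_pg_replicas  pg_num * size of the pool being created or resized
//   max_pg_per_osd   stand-in for the mon_max_pg_per_osd option
// Returns true when the projected average PG count on the rule's OSDs
// stays within the limit.
bool pg_num_within_limit(const std::map<int, uint64_t>& pgs_per_osd,
                         const std::set<int>& rule_osds,
                         uint64_t new_pg_replicas,
                         uint64_t max_pg_per_osd) {
  if (rule_osds.empty()) {
    return true;  // nothing to compare against; an assumption of this sketch
  }
  // Start from the PGs the new pool adds, then count every PG that already
  // lives on an OSD the rule can use. OSDs outside the rule are ignored.
  uint64_t projected = new_pg_replicas;
  for (int osd : rule_osds) {
    auto it = pgs_per_osd.find(osd);
    if (it != pgs_per_osd.end()) {
      projected += it->second;
    }
  }
  // Comparing totals avoids the rounding of an integer average.
  return projected <= max_pg_per_osd * rule_osds.size();
}
```

For example, with 90, 80, and 100 PGs on OSDs 0 to 2, a rule covering those three OSDs, and a limit of 250 per OSD, a new pool with 32 PGs and size 3 (96 replicas) is accepted, while one with 512 PGs and size 3 (1536 replicas) is rejected. A heavily loaded OSD outside the rule does not affect the result.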