Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd/PG: revert approx size #18755

Merged
merged 3 commits into from Jan 15, 2018
Merged

Conversation

aclamk
Copy link
Contributor

@aclamk aclamk commented Nov 5, 2017

http://tracker.ceph.com/issues/22654

This pull request is result of binary search on master branch to find why lately I got worse performance from incerta07.

I narrowed it to commit 024b5bc.

Test: 600s FIO/rbd 4K random write.

When enabled:
OSD0 consumed: 1h33:33
OSD1 consumed 1h38:49
FIO/rbd has iops: 29172

When it is reverted:
OSD0 consumed: 56:45
OSD1 consumed: 1h02:53
FIO/rbd has iops: 38344

It is possible that this change was a critical one, yet I am amazed by the scope in performance it caused.

@liewegas liewegas changed the title Wip reverted approx size osd/PG: revert approx size Nov 5, 2017
@xiexingguo
Copy link
Member

xiexingguo commented Nov 6, 2017

@aclamk I'd suggest doing a complete revert of 46bfe36 instead of partially reverting it.

Actually the root cause is lost to me. http://www.cplusplus.com/reference/list/list/size/ says the cost of list.size() is constant now in c++11, then perhaps it is because the compiler is smart enough to don't calculate list.size() at all if there is no consumers? dunno..

@tchaikov
Copy link
Contributor

tchaikov commented Nov 6, 2017

libstdc++'s list::size() is O(1) in C++11.

then perhaps it is because the compiler is smart enough to don't calculate list.size() at all if there is no consumers

i don't follow this. could you expand it?

@xiexingguo
Copy link
Member

i don't follow this. could you expand it?

@tchaikov Sorry, I think the root cause of this significant performance regression is still unclear to me. So probably I am totally wrong about it...

@liewegas
Copy link
Member

liewegas commented Nov 7, 2017

Le's go ahead with reverting this until we understand what's going on?

@liewegas
Copy link
Member

liewegas commented Nov 7, 2017

@xiexingguo what is wrong with this revert? it looks like the whole thing to me?

@xiexingguo
Copy link
Member

what is wrong with this revert? it looks like the whole thing to me?

@liewegas We should revert PrimaryLogPG::maybe_force_recovery() also, see
46bfe36#diff-fb41013d27e932534adb50eb3de2aaa5R714

@xiexingguo
Copy link
Member

Le's go ahead with reverting this until we understand what's going on?

Yeah, fine with me.

@xiexingguo
Copy link
Member

what is wrong with this revert? it looks like the whole thing to me?

@liewegas We should revert PrimaryLogPG::maybe_force_recovery() also, see
46bfe36#diff-fb41013d27e932534adb50eb3de2aaa5R714

If the way we count log size is critical, then it might also has a big impact on performance if pg is currently
degraded/backfilling.

@songbaisen
Copy link

the list.size() is O(n) I prefer some books named <effective c++> and <more effective c++ > also < stl 实现原理> are all good. like below https://book.douban.com/subject/1110934/
😄

@liewegas
Copy link
Member

liewegas commented Nov 8, 2017

maybe for some reason _GLIBCXX_USE_CXX11_ABI isn't defined when we build?

@songbaisen
Copy link

// count the number of nodes
size_t
_M_node_count() const
{ return std::distance(begin(), end()); }

Yes, that is O(n).
see below analyse:

http://www.cnblogs.com/sfwtoms/p/3952890.html

@tchaikov
Copy link
Contributor

tchaikov commented Nov 8, 2017

i think it's an issue of old libstdc++ implementation.

in libstdc++-5.1 and up

      /**  Returns the number of elements in the %list.  */
      size_type
      size() const _GLIBCXX_NOEXCEPT
      { return this->_M_node_count(); }
// ....
#if _GLIBCXX_USE_CXX11_ABI
//...
      // return the stored size
      size_t _M_node_count() const { return *_M_impl._M_node._M_valptr(); }
#else
//...
      // count the number of nodes
      size_t _M_node_count() const
      {
    return _S_distance(_M_impl._M_node._M_next,
               std::__addressof(_M_impl._M_node));
      }
#endif

in libstdc++-4.9

      /**  Returns the number of elements in the %list.  */
      size_type
      size() const _GLIBCXX_NOEXCEPT
      { return std::distance(begin(), end()); }

see gcc-mirror/gcc@60f5701

and rhel 7.4 comes with gcc-c++-4.8.5-16.el7.x86_64. probably we can install devtoolset-7 on our jenkins nodes and testbeds to address this issue? @aclamk @alfredodeza what do you think?

since libstdc++ is being actively updated and its performance is being improved. my wild guess is that this could be a tip of the iceberg. if we can have better performance without any code change in ceph/ceph. (yeah, there is actual work in our CI infrastructure, for instance. ceph/ceph-bulid). probably it's worthy the efforts to use the new devtoolset?

@songbaisen
Copy link

@tchaikov 👍 good!

@tchaikov
Copy link
Contributor

tchaikov commented Nov 8, 2017

@liewegas no, _GLIBCXX_USE_CXX11_ABI is always defined by default.

@liewegas
Copy link
Member

liewegas commented Nov 8, 2017

I'm worried about having a severe performance regression when an old glibc is present. Unless we have a cmake guard that breaks the build in that case or something?

@xiexingguo the approx_size is ugly but it isn't really problematic, right?

@xiexingguo
Copy link
Member

xiexingguo commented Nov 9, 2017

@xiexingguo the approx_size is ugly but it isn't really problematic, right?

size_t num_to_trim = pg_log.get_log().approx_size() - target;
if (num_to_trim < cct->_conf->osd_pg_log_trim_min) {
  return;
}
list<pg_log_entry_t>::const_iterator it = pg_log.get_log().log.begin();
     eversion_t new_trim_to;
     for (size_t i = 0; i < num_to_trim; ++i) {
       new_trim_to = it->version;
       ++it;
       if (new_trim_to > limit) {
 	new_trim_to = limit;
 	dout(10) << "calc_trim_to trimming to min_last_complete_ondisk" << dendl;
 	break;
       }
}

seems we'll always overestimate the num of logs to trim, hence we might be at risk of access-violation while increasing the it without checking against it != pg_log.get_log().log.end()

the chance is rare, though!

@songbaisen
Copy link

songbaisen commented Nov 9, 2017

yes, I agree with sage weil may be someone build the code from source and with the old glibc, who will get bad performance.

tchaikov added a commit to tchaikov/ceph that referenced this pull request Nov 9, 2017
__GLIBCPP__ is used before 3.4.0, and gcc 4.8.0 and up is required to
build c++11 source, hence it's safe to check __GLIBCXX__. for the
versioning of libstdc++, see
https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html#abi.versioning,
for the magic number of 20150422ul see
https://gcc.gnu.org/develop.html#timeline.
for the reason why we want to have this warning, see
ceph#18755 (comment) .

Signed-off-by: Kefu Chai <kchai@redhat.com>
@tchaikov
Copy link
Contributor

tchaikov commented Nov 9, 2017

I'm worried about having a severe performance regression when an old glibc is present. Unless we have a cmake guard that breaks the build in that case or something?

@liewegas see #18837. it's a little bit too cruel to the users w/ old devtools. and more importantly it will break our CI infrastructure where it's still using gcc 4.8 and libstdc++ 4.8. again a recent devtoolset like devtoolset-7 is necessary to fix this issue and probably other unidentified performance issue if we compare the performance between two builds from gcc 7 and gcc 4.8.

tchaikov added a commit to tchaikov/ceph that referenced this pull request Nov 9, 2017
__GLIBCPP__ is used before 3.4.0, and gcc 4.8.0 and up is required to
build c++11 source, hence it's safe to check __GLIBCXX__. for the
versioning of libstdc++, see
https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html#abi.versioning,
for the magic number of 20150422ul see
https://gcc.gnu.org/develop.html#timeline.
for the reason why we want to have this warning, see
ceph#18755 (comment) .

Signed-off-by: Kefu Chai <kchai@redhat.com>
@liewegas
Copy link
Member

liewegas commented Nov 9, 2017

👍 on the warning. We still need to revert though :( since our release builds use those distros.

Hmm, is it possible to add the devtoolset to the BuildDepends?

@liewegas
Copy link
Member

liewegas commented Nov 9, 2017

...because then we could turn that warning into an error. That would be ideal :)

tchaikov added a commit to tchaikov/ceph that referenced this pull request Nov 10, 2017
__GLIBCPP__ is used before 3.4.0, and gcc 4.8.0 and up is required to
build c++11 source, hence it's safe to check __GLIBCXX__. for the
versioning of libstdc++, see
https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html#abi.versioning,
for the magic number of 20150422ul see
https://gcc.gnu.org/develop.html#timeline.
for the reason why we want to have this warning, see
ceph#18755 (comment) .

Signed-off-by: Kefu Chai <kchai@redhat.com>
@tchaikov
Copy link
Contributor

tchaikov commented Dec 5, 2017

@aclamk can we close this PR now that we are using GCC-7 for building on centos-7? and we will fail the build on GCC 4.8. see #19344 .

@liewegas liewegas closed this Dec 21, 2017
@tchaikov tchaikov reopened this Dec 25, 2017
@tchaikov
Copy link
Contributor

tchaikov commented Dec 25, 2017

sorry! @aclamk @liewegas I was wrong. devtoolset-7 disables the new C++ ABI.

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-7/root/usr --mandir=/opt/rh/devtoolset-7/root/usr/share/man --infodir=/opt/rh/devtoolset-7/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array --with-default-libstdcxx-abi=gcc4-compatible --with-isl=/builddir/build/BUILD/gcc-7.2.1-20170829/obj-x86_64-redhat-linux/isl-install --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 7.2.1 20170829 (Red Hat 7.2.1-1) (GCC)

in which, it has --with-default-libstdcxx-abi=gcc4-compatible.

so as you suspected: _GLIBCXX_USE_CXX11_ABI=0

@liewegas
Copy link
Member

Hmm, how do we go about getting the new ABI then? Do we need -8 or something?

@tchaikov
Copy link
Contributor

tchaikov commented Dec 25, 2017

@liewegas, no, I don't think gcc-8 or DTS-8 would help. because GCC has excellent backward compatibility. we just need to find a way to open the pandora box. I will test different -fabi-version=n and -fabi-compat-version=m settings tomorrow, but I am sure -D_GLIBCXX_USE_CXX11_ABI=1 does not work.

@tchaikov tchaikov self-assigned this Dec 25, 2017
@tchaikov
Copy link
Contributor

I am pasting the related discussion at https://www.spinics.net/lists/ceph-devel/msg39412.html:

see
https://git.centos.org/blob/rpms!devtoolset-gcc.git/2e5ef6c7934d4417e095855478736742b35bd0af/SPECS!gcc.spec#L1026
. so,

  • “--with-default-libstdcxx-abi=gcc4-compatible”, the DTS-7's
    libstdc++'s default ABI is the old one, and
  • it does not use dual ABI.

hence what we have is the gcc4 compatible ABI, and nothing more, on centos 7.

unless the SCL guys change their minds, or we swing our own GCC build, we can not get the new ABI.

@tchaikov tchaikov removed their assignment Dec 26, 2017
@tchaikov
Copy link
Contributor

@aclamk could you cherry-pick 46bfe36 also, as suggested by @xiexingguo .

@liewegas
Copy link
Member

liewegas commented Jan 9, 2018

@aclamk @tchaikov Wait.. we need to keep approx_size since we don't have O(1) std::list::size(), right? Which means we should drop that commit and stick with the original 2 commits?

@aclamk
Copy link
Contributor Author

aclamk commented Jan 10, 2018

@tchaikov @liewegas

  1. Deleted 3rd commit.
  2. I think we should leave it until we are sure all implementations have list.size() O(1).

This reverts commit 024b5bc.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
@tchaikov
Copy link
Contributor

@liewegas @aclamk sorry. i mean, we should revert 46bfe36#diff-fb41013d27e932534adb50eb3de2aaa5R714 also.

This reverts commit 46bfe36.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
@aclamk
Copy link
Contributor Author

aclamk commented Jan 10, 2018

@tchaikov
Reverted.

@tchaikov tchaikov merged commit 17dcd13 into ceph:master Jan 15, 2018
@tchaikov
Copy link
Contributor

tchaikov commented Jan 17, 2018

for another list::size() related change, see #19813. unlike this one, it will be backported to luminous.

jdurgin referenced this pull request in ceph/ceph-ci Jul 9, 2018
Signed-off-by: Neha Ojha <nojha@redhat.com>
@tchaikov tchaikov mentioned this pull request Oct 8, 2018
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants