Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crimson: Enable tcmalloc when using seastar #46062

Merged
merged 2 commits into from Jun 7, 2023

Conversation

markhpc
Copy link
Member

@markhpc markhpc commented Apr 28, 2022

classic-osds have always caused significant memory fragmentation
when using the libc memory allocator due to the way that Ceph
tends to utilize memory. In recent testing, crimson-osd was found
to use 25-27GB of RAM with the stock 3GB bluestore cache settings
(osd_memory_target is only used when tcmalloc is available). Upon
further testing, it was found that the classic OSD is even worse,
using between 32-33GB of RAM after a 5 minute 4K sequential
write test when using libc malloc.

The good news is that it appears that crimson-osd is able to use
tcmalloc for alienstore without significant modification. Better
still, it drastically reduces memory usage. In the same test that
resulted in 25GB RSS memory usage for crimson-osd with libc malloc,
a tcmalloc linked version took around 9GB (with an 8GB
osd_memory_target). Since we do not yet (afaik) expose classic OSD
debugging in crimson it is tough to tell why we are still a little
over, but it's clear that for alienstore we are going to need to
use tcmalloc as we do in classic.

Signed-off-by: Mark Nelson mnelson@redhat.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@markhpc
Copy link
Member Author

markhpc commented Apr 29, 2022

src/tcmalloc.cc:332] Attempt to free invalid pointer 0x6040000001d0

Previously this has been caused by libtcmalloc not supporting aligned_alloc:

https://bugzilla.redhat.com/show_bug.cgi?id=1569391
https://tracker.ceph.com/issues/23653

Seeing now if I can reproduce this on my centos stream test setup.

@markhpc
Copy link
Member Author

markhpc commented Apr 29, 2022

These are the tests that failed make check in jenkins due to attempting to free invalid pointers:

projectroot.src.test.crimson.unittest_seastar_alienstore_thread_pool
projectroot.src.test.crimson.seastore.unittest_object_data_handler
projectroot.src.test.crimson.seastore.unittest_collection_manager

Running ninja check locally with libtcmalloc.so.4.5.3, these all appear to pass:

226/249 Test #226: unittest-seastar-alienstore-thread-pool ... Passed 0.11 sec
238/249 Test #238: unittest-object-data-handler .............. Passed 8.76 sec
239/249 Test #239: unittest-collection-manager ............... Passed 4.16 sec

There were a couple of other failures, but I do not know if they are specifically related to this PR:

The following tests FAILED:
33 - smoke.sh (Failed)
172 - mgr-dashboard-smoke.sh (Failed)
187 - safe-to-destroy.sh (Failed)
225 - unittest-seastar-messenger (SEGFAULT)
232 - unittest-seastar-messenger-thrash (Subprocess aborted)

@markhpc
Copy link
Member Author

markhpc commented Apr 29, 2022

jenkins test make check

@rzarzynski
Copy link
Contributor

jenkins retest this please

@rzarzynski
Copy link
Contributor

rzarzynski commented May 1, 2022

The following tests FAILED:
	231 - unittest-seastar-alienstore-thread-pool (Child aborted)
	243 - unittest-object-data-handler (Child aborted)
	244 - unittest-collection-manager (Child aborted)

@rzarzynski
Copy link
Contributor

jenkins retest this please

1 similar comment
@rzarzynski
Copy link
Contributor

jenkins retest this please

src/crimson/CMakeLists.txt Outdated Show resolved Hide resolved
@@ -1,4 +1,4 @@
if(ALLOCATOR STREQUAL "tcmalloc" AND NOT WITH_SEASTAR)
if(ALLOCATOR STREQUAL "tcmalloc")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liu-chunmei does the reason why we disabled heap profiler still hold ?

see 028159e .

tchaikov
tchaikov previously approved these changes May 3, 2022
@tchaikov
Copy link
Contributor

tchaikov commented May 3, 2022

FWIW, I just recompiled without this PR applied using: ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SEASTAR=ON

and neither ceph-osd nor crimson-osd are linked against tcmalloc, but radosgw is.

that's scaring. it deserves a tracker ticket.

@tchaikov tchaikov dismissed their stale review May 3, 2022 01:38

dismissed due to test failure.

@tchaikov
Copy link
Contributor

tchaikov commented May 3, 2022

        Start 231: unittest-seastar-alienstore-thread-pool
179/277 Test #231: unittest-seastar-alienstore-thread-pool ...Child aborted***Exception:   0.02 sec
src/tcmalloc.cc:332] Attempt to free invalid pointer 0x6040000001d0 

we'd need to address this failure first.

@tchaikov
Copy link
Contributor

tchaikov commented May 3, 2022

FWIW, I just recompiled without this PR applied using: ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_SEASTAR=ON
and neither ceph-osd nor crimson-osd are linked against tcmalloc, but radosgw is.

that's scaring. it deserves a tracker ticket.

should be addressed by #46103

@djgalloway djgalloway changed the base branch from master to main May 25, 2022 20:00
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Feb 28, 2023
@rzarzynski rzarzynski self-requested a review February 28, 2023 22:20
@github-actions github-actions bot removed the stale label Feb 28, 2023
Copy link
Contributor

@rzarzynski rzarzynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to port the Valgrind's suppression rules to ASan:

rzarzynski@teuthology:/a/rzarzynski-2023-03-12_13:57:50-crimson-rados-main-distro-crimson-smithi$ less 7204125/teuthology.log
...
2023-03-12T14:22:11.448 DEBUG:teuthology.orchestra.run.smithi027:> sudo MALLOC_CHECK_=3 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph-osd --no-mon-config --cluster ceph --mkfs --mkkey 
-i 0 --monmap /home/ubuntu/cephtest/ceph.monmap
...
023-03-12T14:22:17.159 INFO:teuthology.orchestra.run.smithi027.stderr:=================================================================
2023-03-12T14:22:17.159 INFO:teuthology.orchestra.run.smithi027.stderr:==104216==ERROR: LeakSanitizer: detected memory leaks
2023-03-12T14:22:17.159 INFO:teuthology.orchestra.run.smithi027.stderr:
2023-03-12T14:22:17.159 INFO:teuthology.orchestra.run.smithi027.stderr:Direct leak of 8 byte(s) in 1 object(s) allocated from:
2023-03-12T14:22:17.159 INFO:teuthology.orchestra.run.smithi027.stderr:    #0 0x7fa2ec22d307 in operator new(unsigned long) (/lib64/libasan.so.6+0xb6307)
2023-03-12T14:22:17.160 INFO:teuthology.orchestra.run.smithi027.stderr:    #1 0x7fa2eb769ddd in InitModule() [clone .part.4] (/lib64/libtcmalloc.so.4+0x2eddd)
2023-03-12T14:22:17.160 INFO:teuthology.orchestra.run.smithi027.stderr:
2023-03-12T14:22:17.160 INFO:teuthology.orchestra.run.smithi027.stderr:SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
2023-03-12T14:22:17.226 DEBUG:teuthology.orchestra.run:got remote process result: 1
2023-03-12T14:22:17.227 DEBUG:teuthology.orchestra.run.smithi027:> sudo MALLOC_CHECK_=3 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /home/ubuntu/cephtest/ceph.monmap
...
2023-03-12T14:22:18.136 INFO:teuthology.orchestra.run.smithi027.stderr:INFO  2023-03-12 14:22:18,136 [shard 0] ms - [0x61100002c3c0 client.?(temp_mon_client) - >> mon.? v2:172.21.15.27:3300/0] protocol CONNECTING execute_connecting fault, going to WAIT io_stat(io_state=delay, in_seq=0, out_seq=0, out_pending_msgs_size=1, out_sent_msgs_size=0, need_ack=0, need_keepalive=0, need_keepalive_ack=0) -- Connection refused
2023-03-12T14:22:18.136 INFO:teuthology.orchestra.run.smithi027.stderr:WARN  2023-03-12 14:22:18,136 [shard 0] ms - [0x61100002c3c0 client.?(temp_mon_client) - >> mon.? v2:172.21.15.27:3300/0] waiting 0.2 seconds ...
2023-03-12T14:22:18.336 INFO:teuthology.orchestra.run.smithi027.stderr:INFO  2023-03-12 14:22:18,336 [shard 0] ms - [0x61100002c3c0 client.?(temp_mon_client) - >> mon.? v2:172.21.15.27:3300/0] execute_wait(): going to CONNECTING
2023-03-12T14:22:18.337 INFO:teuthology.orchestra.run.smithi027.stderr:INFO  2023-03-12 14:22:18,337 [shard 0] ms - [0x61100002c3c0 client.?(temp_mon_client) - >> mon.? v2:172.21.15.27:3300/0] protocol CONNECTING execute_connecting fault, going to WAIT io_stat(io_state=delay, in_seq=0, out_seq=0, out_pending_msgs_size=1, out_sent_msgs_size=0, need_ack=0, need_keepalive=0, need_keepalive_ack=0) -- Connection refused
2023-03-12T14:22:18.337 INFO:teuthology.orchestra.run.smithi027.stderr:WARN  2023-03-12 14:22:18,337 [shard 0] ms - [0x61100002c3c0 client.?(temp_mon_client) - >> mon.? v2:172.21.15.27:3300/0] waiting 0.4 seconds ...
...
2023-03-13T02:06:06.399 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2023-03-13T02:06:06.401 DEBUG:teuthology.task.console_log:Killing console logger for smithi027
2023-03-13T02:06:06.402 DEBUG:teuthology.task.console_log:Killing console logger for smithi139

Otherwise --mkfs won't create object store data turning entire crimson-rados runs red: http://pulpito.front.sepia.ceph.com:80/rzarzynski-2023-03-12_13:57:50-crimson-rados-main-distro-crimson-smithi/.

@cbodley
Copy link
Contributor

cbodley commented Mar 16, 2023

fyi, i recently added a qa/lsan.supp file for leak suppressions in #50385

@rzarzynski
Copy link
Contributor

To avoid touching teuthology.git for the sake of suppression file distribution among nodes, I'm working on ceph/ceph-ci@44b621c.

@rzarzynski
Copy link
Contributor

Perhaps we could unify (thinking about generating .cc from the .supp).

@rzarzynski
Copy link
Contributor

This PR needs #50598 as a dependency.

@athanatos athanatos self-requested a review March 20, 2023 20:33
@markhpc
Copy link
Member Author

markhpc commented Mar 23, 2023

@rzarzynski yeah, let's get #50598 merged. We can combine this if you want or just approve that one first. I'm ok with whatever.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label May 22, 2023
@rzarzynski
Copy link
Contributor

@markhpc: #50598 is in!

@github-actions github-actions bot removed the stale label May 23, 2023
@markhpc
Copy link
Member Author

markhpc commented May 25, 2023

Huzzah! @rzarzynski IS there anything we need for this PR to merge now that #50598 is in? (this PR is blocked on your review)

Copy link
Contributor

@rzarzynski rzarzynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as the (not so explicit) dependency got merged.

@rzarzynski
Copy link
Contributor

CC: @Matan-B.

@Matan-B
Copy link
Contributor

Matan-B commented May 30, 2023

jenkins test make check

Copy link
Contributor

@Matan-B Matan-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

New unrelated asan failure - https://tracker.ceph.com/issues/61504.

@athanatos
Copy link
Contributor

jenkins retest this please

@athanatos
Copy link
Contributor

athanatos commented May 30, 2023

Hrm, we're seeing address sanitizer errors in make check:

==2692420==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x5641d4ffaedd in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest-object-data-handler+0x7c01edd) (BuildId: 7aeb00a388469f4de54fedaaceb47fc006ce2906)
    #1 0x5641d50d46de in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x5641d50d456a in std::allocator<char>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x5641d50d456a in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    #4 0x5641d50d3fcf in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:153:14
    #5 0x5641d51384f3 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:307:21
    #6 0x5641d5264dba in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:395:8
    #7 0x7fb22a6426f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)

SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).

Presumably similar to the false positive from above?

@athanatos athanatos self-requested a review May 30, 2023 23:17
Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See asan make check issue above.

@Matan-B
Copy link
Contributor

Matan-B commented May 31, 2023

@rzarzynski had a good hypothesis that it's perfglue that's pulling tcmalloc in. I just recompiled and indeed, we don't need this at all. If we simply remove "AND NOT WITH_SEASTAR" from perfglue's CMakeLists.txt we will link tcmalloc in both crimson and classic OSD when "WITH_SEASTAR=ON" is used.

Since the implicit change to perfglue's CMakeLists.txt result in different allocator selected I pushed #51851 to able to identify which one is being used.

@Matan-B
Copy link
Contributor

Matan-B commented Jun 1, 2023

Hrm, we're seeing address sanitizer errors in make check:

==2692420==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x5641d4ffaedd in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest-object-data-handler+0x7c01edd) (BuildId: 7aeb00a388469f4de54fedaaceb47fc006ce2906)
    #1 0x5641d50d46de in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x5641d50d456a in std::allocator<char>::allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    #3 0x5641d50d456a in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    #4 0x5641d50d3fcf in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:153:14
    #5 0x5641d51384f3 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:307:21
    #6 0x5641d5264dba in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:395:8
    #7 0x7fb22a6426f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)

SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).

Presumably similar to the false positive from above?

Similar suppression from lsan.supp:

# from perfglue/heap_profiler.cc
# gperftools allocates a singleton and never frees it
leak:^MallocExtension::Register

I think we should add MallocExtension::Initialize() to the embedded suppressions same way #50598 did. I will push a follow up PR.

Edit: Looks like this is gperf's alloc https://github.com/gperftools/gperftools/blob/master/src/tcmalloc.cc#L1146

Matan-B and others added 2 commits June 6, 2023 14:53
…n using seastar"

This reverts commit 380bc6d.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
classic-osds have always caused significant memory fragmentation
when using the libc memory allocator due to the way that Ceph
tends to utilize memory.  In recent testing, crimson-osd was found
to use 25-27GB of RAM with the stock 3GB bluestore cache settings
(osd_memory_target is only used when tcmalloc is available).  Upon
further testing, it was found that the classic OSD is even worse,
using between 32-33GB of RAM after a 5 minute 4K sequential
write test when using libc malloc.

The good news is that it appears that crimson-osd is able to use
tcmalloc for alienstore without significant modification. Better
still, it drastically reduces memory usage.  In the same test that
resulted in 25GB RSS memory usage for crimson-osd with libc malloc,
a tcmalloc linked version took around 9GB (with an 8GB
osd_memory_target).  Since we do not yet (afaik) expose classic OSD
debugging in crimson it is tough to tell why we are still a little
over, but it's clear that for alienstore we are going to need to
use tcmalloc as we do in classic.

Signed-off-by: Mark Nelson <mnelson@redhat.com>
@Matan-B
Copy link
Contributor

Matan-B commented Jun 6, 2023

Rebased and reverted the commit from #51875. (no changes, solely for accurate commit history)

@Matan-B Matan-B requested a review from athanatos June 6, 2023 14:59
@Matan-B Matan-B merged commit 53399f6 into ceph:main Jun 7, 2023
10 of 11 checks passed
@markhpc markhpc deleted the wip-crimson-tcmalloc branch June 8, 2023 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants