Memory leak reported out of __cxa_thread_atexit #5931
Comments
I'm seeing very high memory usage on rocksdb when running a process for a week. I am wondering if this is related. Any ideas? Commenting here to bump this... |
I am also seeing the same I found this issue after seeing the same 2 valgrind stacks for The 2 This is on RHEL 7.8 with libstdc++-4.8.5-39 which is currently the latest. ( I need to edit
This is for 6.11.4 but the same 2 valgrind __cxa_thread_atexit "definite lost" stacks appear back to 5.18.0
|
@thatsafunnyname the LRU cache leaking is intended. This can be avoided if you remove this line: https://github.com/facebook/rocksdb/blob/master/tools/db_bench_tool.cc#L2764 . This line is to intentionally leak blocks in the block cache to speed up the program shut down process. I can't reproduce the __cxa_thread_atexit leak in my environment in master. Will try 6.11.4 |
Hmm, I can't reproduce it in 6.11.4 either. Which allocator are you using? We usually use the glibc allocator when running valgrind (DISABLE_JEMALLOC=1), as jemalloc doesn't work well with valgrind. |
Thanks for taking a look and trying to reproduce. I had started with the allocator we run with, so jemalloc 5.2.0, but had also tried jemalloc 4.5.0 (valgrind support being dropped in v5) and jemalloc 3.6.0.
I checked db_bench is not using libjemalloc with I will try on an AWS EC2 AL2 host (no jemalloc libs installed) tomorrow, at the moment I am getting link problems to gflags when trying to build 6.11.4 on AL2. |
@thatsafunnyname that's interesting. Thanks for trying it. Let us know what you found. |
Update summary: I ran into ( I did an AL2 on-host build of the latest valgrind (valgrind-3.17.0.GIT) and it had the same valgrind problem (illegal instruction) when RDB was not built with I am going to test with some compilers other than Details of the valgrind error: On a newly started: "Amazon Linux 2" AMI with kernel 4.14.186-146.268.amzn2.x86_64 - amzn2-ami-hvm-2.0.20200722.0-x86_64-gp2 (ami-02354e95b39ca8dec)
I saw the same failure when using an AL2 on-host compiled valgrind-3.17.0.GIT. |
Using gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) from devtoolset-4 on the RHEL7 host, valgrind reports the 2 On a RHEL8.2 host on AWS with kernel 4.18.0-193.el8.x86_64 , RHEL-8.2.0_HVM-20200423-x86_64-0-Hourly2-GP2 (ami-098f16afa9edf40be) I had to build RDB with Details for the RHEL8.2 host on AWS EC2.
|
These are the steps to reproduce the As I can not reproduce it on "Amazon Linux 2" or RHEL8, and it only ever seems to be 24 bytes in each of the loss records (per On a RHEL7.7 host on AWS with kernel 3.10.0-1062.1.2.el7.x86_64 , RHEL-7.7_HVM-20190923-x86_64-0-Hourly2-GP2 (ami-029c0fbe456d58bd1) , building RocksDB with
|
Also noticed at https://jira.mariadb.org/browse/MDEV-21788 |
A similar issue is reported at cameron314/concurrentqueue#152, and there is an upstream bug report: https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg551653.html . I think this issue can be closed in favour of the upstream report as "fixed in upstream". |
We also hit this memory leak error with an ASAN check: #ASAN_OPTIONS=fast_unwind_on_malloc=true,detect_stack_use_after_return=1,detect_odr_violation=2,detect_container_overflow=1,log_path=stderr,new_delete_type_mismatch=1,alloc_dealloc_mismatch=1,suppressions=/v/asan_conf/asan.supp ./column_family_test --gtest_filter=FormatDef/ColumnFamilyTest.FlushTest/0
Note: Google Test filter = FormatDef/ColumnFamilyTest.FlushTest/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FormatDef/ColumnFamilyTest
[ RUN ] FormatDef/ColumnFamilyTest.FlushTest/0
[ OK ] FormatDef/ColumnFamilyTest.FlushTest/0 (2983 ms)
[----------] 1 test from FormatDef/ColumnFamilyTest (2984 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (2986 ms total)
[ PASSED ] 1 test.
=================================================================
==24675==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7fb5d4d6be9f in operator new(unsigned long, std::nothrow_t const&) (/lib64/libasan.so.5+0x10fe9f)
#1 0x7fb5d1638135 in __cxa_thread_atexit (/lib64/libstdc++.so.6+0xa5135)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s). |
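For readers unfamiliar with where `__cxa_thread_atexit` comes from: a `thread_local` object with a non-trivial destructor has to be registered with the C++ runtime so that its destructor runs at thread exit, and the traces in this thread suggest that libstdc++ allocates a small record inside `__cxa_thread_atexit` for that registration. The following is a minimal sketch of the language feature involved, not RocksDB code; `Tracker`, `g_destroyed`, and `run_threads` are hypothetical names made up for illustration. Whether a leak is reported for such a program depends on the libstdc++, glibc, and tool versions discussed above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many thread-local destructors have run so far.
static std::atomic<int> g_destroyed{0};

// Hypothetical type with a non-trivial destructor. Because of the destructor,
// each thread that uses a thread_local Tracker must register it with the
// runtime so it is destroyed when that thread exits.
struct Tracker {
  int touched = 0;
  ~Tracker() { g_destroyed.fetch_add(1); }
};

// Starts n threads that each touch their own thread_local Tracker, joins
// them, and returns how many Tracker destructors ran for this batch.
int run_threads(int n) {
  const int before = g_destroyed.load();
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([] {
      thread_local Tracker t;  // constructed (and registered) on first use
      t.touched++;
    });
  }
  for (auto& th : threads) th.join();
  return g_destroyed.load() - before;
}
```

Calling `run_threads(4)` should construct and destroy one `Tracker` per thread. Running a small program like this under Valgrind or AddressSanitizer on an affected host is one way to check whether the report comes from the toolchain rather than from RocksDB itself.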
When moving from ubuntu to centos based workers, new memory leaks emerged in some tests:
ep-engine_ep_unit_tests.RocksFullOrValue/DurabilityWarmupTest:
Direct leak of 48 byte(s) in 2 object(s) allocated from:
#0 0x6e2e27 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7efc40b38fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 48 byte(s) leaked in 2 allocation(s).
ep_testsuite_checkpoint.value_eviction.rocksdb:
Direct leak of 288 byte(s) in 12 object(s) allocated from:
#0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7faea5f13fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 288 byte(s) leaked in 12 allocation(s).
ep_testsuite_checkpoint.full_eviction.rocksdb:
Direct leak of 360 byte(s) in 15 object(s) allocated from:
#0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7f3810c11fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 360 byte(s) leaked in 15 allocation(s).
This appears to perhaps be an issue with rocksdb (similar errors in facebook/rocksdb#5931) or maybe libstdc++ itself, but given we don't ship rocksdb, this change simply adds __cxa_thread_atexit to the suppressions file.
Change-Id: I615822181c54aa7fc3b0c0db00d7b70008323008
Reviewed-on: https://review.couchbase.org/c/tlm/+/196843
Tested-by: Blair Watt <blair.watt@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
(Backport to fix Asan issues on neo branch under MB-59356)
When moving from ubuntu to centos based workers, new memory leaks emerged in some tests:
ep-engine_ep_unit_tests.RocksFullOrValue/DurabilityWarmupTest:
Direct leak of 48 byte(s) in 2 object(s) allocated from:
#0 0x6e2e27 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7efc40b38fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 48 byte(s) leaked in 2 allocation(s).
ep_testsuite_checkpoint.value_eviction.rocksdb:
Direct leak of 288 byte(s) in 12 object(s) allocated from:
#0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7faea5f13fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 288 byte(s) leaked in 12 allocation(s).
ep_testsuite_checkpoint.full_eviction.rocksdb:
Direct leak of 360 byte(s) in 15 object(s) allocated from:
#0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
#1 0x7f3810c11fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
SUMMARY: AddressSanitizer: 360 byte(s) leaked in 15 allocation(s).
This appears to perhaps be an issue with rocksdb (similar errors in facebook/rocksdb#5931) or maybe libstdc++ itself, but given we don't ship rocksdb, this change simply adds __cxa_thread_atexit to the suppressions file.
Change-Id: I615822181c54aa7fc3b0c0db00d7b70008323008
Reviewed-on: https://review.couchbase.org/c/tlm/+/200473
Well-Formed: Restriction Checker
Reviewed-by: Trond Norbye <trond.norbye@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
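The commit messages above work around the report by adding `__cxa_thread_atexit` to a suppressions file. For projects that would rather keep the suppression next to the test binary, LeakSanitizer also consults a `__lsan_default_suppressions` hook if the program defines one. The sketch below assumes a LeakSanitizer-enabled build (for example via `-fsanitize=address`); it is not taken from the Couchbase or RocksDB trees, and the pattern suppresses every leak whose stack contains that frame, so it may hide genuine leaks routed through the same function.

```cpp
#include <cassert>
#include <cstring>

// LeakSanitizer looks this function up by name at startup and treats the
// returned string as suppression rules, one "leak:<pattern>" per line.
// In a build without the sanitizer it is just an ordinary, unused function.
extern "C" const char* __lsan_default_suppressions() {
  // Assumption: the only leak to silence is the small record allocated in
  // __cxa_thread_atexit, as seen in the stack traces in this thread.
  return "leak:__cxa_thread_atexit\n";
}
```

The same `leak:__cxa_thread_atexit` line can instead go in a text file passed through `LSAN_OPTIONS=suppressions=<path>`, which avoids recompiling the tests.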
We upgraded to 6.2.4 recently and have started seeing leaks reported out of __cxa_thread_atexit on linux. These leaks are reproducible with db_bench.
Expected behavior
No leaks reported by Valgrind or AddressSanitizer
Actual behavior
Leaks reported by Valgrind and AddressSanitizer
Steps to reproduce the behavior:
Dockerfile to build rocksdb and valgrind
db_bench command:
leaks reported:
full output