Memory leak reported out of __cxa_thread_atexit #5931

Open
shoda-tibco opened this issue Oct 16, 2019 · 12 comments
Comments

@shoda-tibco
Contributor

We upgraded to 6.2.4 recently and have started seeing leaks reported out of __cxa_thread_atexit on Linux. These leaks are reproducible with db_bench.

==1== 24 bytes in 1 blocks are definitely lost in loss record 536 of 1,353
==1==    at 0x4C29680: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:385)
==1==    by 0x7A5115: __cxa_thread_atexit (in /rocksdb-host/db_bench)
==1==    by 0x5FD8E8: UnknownInlinedFun (instrumented_mutex.cc:69)
==1==    by 0x5FD8E8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:24)
==1==    by 0x4EBA45: InstrumentedMutexLock (instrumented_mutex.h:56)
==1==    by 0x4EBA45: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2154)
==1==    by 0x6BBF6A: operator() (std_function.h:706)
==1==    by 0x6BBF6A: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==1==    by 0x6BC0E4: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==1==    by 0x7A521E: execute_native_thread_routine (in /rocksdb-host/db_bench)
==1==    by 0x4E3AAA0: start_thread (in /lib64/libpthread-2.12.so)
==1==    by 0x67D7C4C: clone (in /lib64/libc-2.12.so)
==1== 
==1== 24 bytes in 1 blocks are definitely lost in loss record 537 of 1,353
==1==    at 0x4C29680: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:385)
==1==    by 0x7A5115: __cxa_thread_atexit (in /rocksdb-host/db_bench)
==1==    by 0x5FD8E8: UnknownInlinedFun (instrumented_mutex.cc:69)
==1==    by 0x5FD8E8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:24)
==1==    by 0x4EC3B8: InstrumentedMutexLock (instrumented_mutex.h:56)
==1==    by 0x4EC3B8: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2231)
==1==    by 0x4EC8D1: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2022)
==1==    by 0x6BBF6A: operator() (std_function.h:706)
==1==    by 0x6BBF6A: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==1==    by 0x6BC0E4: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==1==    by 0x7A521E: execute_native_thread_routine (in /rocksdb-host/db_bench)
==1==    by 0x4E3AAA0: start_thread (in /lib64/libpthread-2.12.so)
==1==    by 0x67D7C4C: clone (in /lib64/libc-2.12.so)

Expected behavior

No leaks reported by Valgrind or AddressSanitizer

Actual behavior

Leaks reported by Valgrind and AddressSanitizer

Steps to reproduce the behavior:

Dockerfile to build rocksdb and valgrind

db_bench command:

docker run --rm rocks-leak valgrind --leak-check=full ./db_bench --benchmarks="fillrandom,readrandom" --num 10000000

leaks reported:

==1== 
==1== HEAP SUMMARY:
==1==     in use at exit: 8,615,691 bytes in 7,555 blocks
==1==   total heap usage: 98,705,609 allocs, 98,698,054 frees, 161,920,894,533 bytes allocated
==1== 
==1== 24 bytes in 1 blocks are definitely lost in loss record 536 of 1,353
==1==    at 0x4C29680: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:385)
==1==    by 0x7A5115: __cxa_thread_atexit (in /rocksdb-host/db_bench)
==1==    by 0x5FD8E8: UnknownInlinedFun (instrumented_mutex.cc:69)
==1==    by 0x5FD8E8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:24)
==1==    by 0x4EBA45: InstrumentedMutexLock (instrumented_mutex.h:56)
==1==    by 0x4EBA45: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2154)
==1==    by 0x6BBF6A: operator() (std_function.h:706)
==1==    by 0x6BBF6A: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==1==    by 0x6BC0E4: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==1==    by 0x7A521E: execute_native_thread_routine (in /rocksdb-host/db_bench)
==1==    by 0x4E3AAA0: start_thread (in /lib64/libpthread-2.12.so)
==1==    by 0x67D7C4C: clone (in /lib64/libc-2.12.so)
==1== 
==1== 24 bytes in 1 blocks are definitely lost in loss record 537 of 1,353
==1==    at 0x4C29680: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:385)
==1==    by 0x7A5115: __cxa_thread_atexit (in /rocksdb-host/db_bench)
==1==    by 0x5FD8E8: UnknownInlinedFun (instrumented_mutex.cc:69)
==1==    by 0x5FD8E8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:24)
==1==    by 0x4EC3B8: InstrumentedMutexLock (instrumented_mutex.h:56)
==1==    by 0x4EC3B8: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2231)
==1==    by 0x4EC8D1: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2022)
==1==    by 0x6BBF6A: operator() (std_function.h:706)
==1==    by 0x6BBF6A: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==1==    by 0x6BC0E4: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==1==    by 0x7A521E: execute_native_thread_routine (in /rocksdb-host/db_bench)
==1==    by 0x4E3AAA0: start_thread (in /lib64/libpthread-2.12.so)
==1==    by 0x67D7C4C: clone (in /lib64/libc-2.12.so)
==1== 
==1== 11,996 bytes in 3 blocks are possibly lost in loss record 1,347 of 1,353
==1==    at 0x4C29F64: operator new[](unsigned long) (vg_replace_malloc.c:431)
==1==    by 0x66D317: AllocateBlock (memory_allocator.h:35)
==1==    by 0x66D317: rocksdb::UncompressBlockContentsForCompressionType(rocksdb::UncompressionInfo const&, char const*, unsigned long, rocksdb::BlockContents*, unsigned int, rocksdb::ImmutableCFOptions const&, rocksdb::MemoryAllocator*) (format.cc:300)
==1==    by 0x66DB86: rocksdb::UncompressBlockContents(rocksdb::UncompressionInfo const&, char const*, unsigned long, rocksdb::BlockContents*, unsigned int, rocksdb::ImmutableCFOptions const&, rocksdb::MemoryAllocator*) (format.cc:409)
==1==    by 0x660341: rocksdb::BlockFetcher::ReadBlockContents() (block_fetcher.cc:252)
==1==    by 0x64EB4A: rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*, rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, rocksdb::GetContext*) (block_based_table_reader.cc:2140)
==1==    by 0x65B9FE: rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, rocksdb::Status, rocksdb::FilePrefetchBuffer*) (block_based_table_reader.cc:1983)
==1==    by 0x65744B: rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool) (block_based_table_reader.cc:2765)
==1==    by 0x583CF4: rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, rocksdb::HistogramImpl*, bool, int) (table_cache.cc:392)
==1==    by 0x5A4589: rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, bool*) (version_set.cc:1636)
==1==    by 0x4BB18D: rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*) (db_impl.cc:1447)
==1==    by 0x4BB3C6: rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*) (db_impl.cc:1356)
==1==    by 0x44FDD2: rocksdb::Benchmark::ReadRandom(rocksdb::ThreadState*) (db_bench_tool.cc:4597)
==1== 
==1== 8,553,231 (87 direct, 8,553,144 indirect) bytes in 1 blocks are definitely lost in loss record 1,353 of 1,353
==1==    at 0x4C29F64: operator new[](unsigned long) (vg_replace_malloc.c:431)
==1==    by 0x4669AD: rocksdb::LRUCacheShard::Insert(rocksdb::Slice const&, unsigned int, void*, unsigned long, void (*)(rocksdb::Slice const&, void*), rocksdb::Cache::Handle**, rocksdb::Cache::Priority) (lru_cache.cc:348)
==1==    by 0x467701: rocksdb::ShardedCache::Insert(rocksdb::Slice const&, void*, unsigned long, void (*)(rocksdb::Slice const&, void*), rocksdb::Cache::Handle**, rocksdb::Cache::Priority) (sharded_cache.cc:55)
==1==    by 0x64A91F: rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, rocksdb::BlockContents*, rocksdb::CompressionType, unsigned int, rocksdb::UncompressionDict const&, unsigned long, unsigned long, rocksdb::MemoryAllocator*, bool, rocksdb::Cache::Priority, rocksdb::GetContext*) (block_based_table_reader.cc:1558)
==1==    by 0x64EC1E: rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache(rocksdb::FilePrefetchBuffer*, rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool, rocksdb::GetContext*) (block_based_table_reader.cc:2148)
==1==    by 0x65B9FE: rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, bool, bool, bool, rocksdb::GetContext*, rocksdb::Status, rocksdb::FilePrefetchBuffer*) (block_based_table_reader.cc:1983)
==1==    by 0x65744B: rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, bool) (block_based_table_reader.cc:2765)
==1==    by 0x583CF4: rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::SliceTransform const*, rocksdb::HistogramImpl*, bool, int) (table_cache.cc:392)
==1==    by 0x5A4589: rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*, rocksdb::MergeContext*, unsigned long*, bool*, bool*, unsigned long*, rocksdb::ReadCallback*, bool*) (version_set.cc:1636)
==1==    by 0x4BB18D: rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*, bool*, rocksdb::ReadCallback*, bool*) (db_impl.cc:1447)
==1==    by 0x4BB3C6: rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::PinnableSlice*) (db_impl.cc:1356)
==1==    by 0x44FDD2: rocksdb::Benchmark::ReadRandom(rocksdb::ThreadState*) (db_bench_tool.cc:4597)
==1== 
==1== LEAK SUMMARY:
==1==    definitely lost: 135 bytes in 3 blocks
==1==    indirectly lost: 8,553,144 bytes in 6,205 blocks
==1==      possibly lost: 11,996 bytes in 3 blocks
==1==    still reachable: 50,416 bytes in 1,344 blocks
==1==                       of which reachable via heuristic:
==1==                         stdstring          : 864 bytes in 9 blocks
==1==         suppressed: 0 bytes in 0 blocks
==1== Reachable blocks (those to which a pointer was found) are not shown.
==1== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1== 
==1== For lists of detected and suppressed errors, rerun with: -s
==1== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 4 from 4)

full output

@cculianu
Contributor

I'm seeing very high memory usage on rocksdb when running a process for a week. I am wondering if this is related. Any ideas? Commenting here to bump this...

@thatsafunnyname
Contributor

I am also seeing the same "definitely lost" records reported by valgrind with RocksDB 6.11.4; triggering a compaction with db_bench is enough to see them.

I found this issue after seeing the same 2 valgrind stacks for __cxa_thread_atexit in a unit test from our build while upgrading from 5.15.10 to 6.11.4.

The 2 __cxa_thread_atexit leaks start to appear around 5.18.0, with d6ec288 from 17 Oct 2018 for #4226. The PR includes a discussion about how to perform cleanup.

This is on RHEL 7.8 with libstdc++-4.8.5-39 which is currently the latest.

(I need to edit util/gflags_compat.h, probably because the gflags RPM is old.)

<< #define GFLAGS_NAMESPACE google
>> #define GFLAGS_NAMESPACE gflags
> make db_bench -j48

This is for 6.11.4, but the same 2 valgrind __cxa_thread_atexit "definitely lost" stacks appear as far back as 5.18.0.

> valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==42753== Memcheck, a memory error detector
==42753== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==42753== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==42753== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==42753==
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==42753== Warning: unimplemented fcntl command: 1036
RocksDB:    version 6.11
Date:       Thu Aug 13 11:08:22 2020
CPU:        48 * Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
CPUCache:   30720 KB
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    1
Prefix:    0 bytes
Keys per prefix:    0
RawSize:    0.0 MB (estimated)
FileSize:   0.0 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==42753== Warning: unimplemented fcntl command: 1036
DB path: [/tmp/rocksdbtest-XXX/dbbench]
fillseq      :  117194.000 micros/op 8 ops/sec;    0.0 MB/s
DB path: [/tmp/rocksdbtest-XXX/dbbench]
==42753== Warning: unimplemented fcntl command: 1036
==42753== Warning: unimplemented fcntl command: 1036
compact      :  472803.000 micros/op 2 ops/sec;
==42753==
==42753== HEAP SUMMARY:
==42753==     in use at exit: 80,220 bytes in 1,545 blocks
==42753==   total heap usage: 22,932 allocs, 21,387 frees, 9,072,040 bytes allocated
==42753==
==42753== 24 bytes in 1 blocks are definitely lost in loss record 586 of 1,482
==42753==    at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==42753==    by 0x61D6C5D: __cxa_thread_atexit (atexit_thread.cc:130)
==42753==    by 0x64AEC8: UnknownInlinedFun (instrumented_mutex.cc:71)
==42753==    by 0x64AEC8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==42753==    by 0x506B56: InstrumentedMutexLock (instrumented_mutex.h:56)
==42753==    by 0x506B56: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2303)
==42753==    by 0x5073A2: rocksdb::DBImpl::BGWorkFlush(void*) (db_impl_compaction_flush.cc:2162)
==42753==    by 0x723CEB: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==42753==    by 0x723F30: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==42753==    by 0x622F06F: execute_native_thread_routine (thread.cc:84)
==42753==    by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==42753==    by 0x6A978DC: clone (in /usr/lib64/libc-2.17.so)
==42753==
==42753== 24 bytes in 1 blocks are definitely lost in loss record 587 of 1,482
==42753==    at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==42753==    by 0x61D6C5D: __cxa_thread_atexit (atexit_thread.cc:130)
==42753==    by 0x64AEC8: UnknownInlinedFun (instrumented_mutex.cc:71)
==42753==    by 0x64AEC8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==42753==    by 0x50787B: InstrumentedMutexLock (instrumented_mutex.h:56)
==42753==    by 0x50787B: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2382)
==42753==    by 0x50825B: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2174)
==42753==    by 0x723CEB: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==42753==    by 0x723F30: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==42753==    by 0x622F06F: execute_native_thread_routine (thread.cc:84)
==42753==    by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==42753==    by 0x6A978DC: clone (in /usr/lib64/libc-2.17.so)
==42753==
==42753== 24,576 (16,384 direct, 8,192 indirect) bytes in 1 blocks are definitely lost in loss record 1,482 of 1,482
==42753==    at 0x4C2C375: memalign (vg_replace_malloc.c:908)
==42753==    by 0x4C2C486: posix_memalign (vg_replace_malloc.c:1073)
==42753==    by 0x67FEB6: rocksdb::port::cacheline_aligned_alloc(unsigned long) (port_posix.cc:210)
==42753==    by 0x46218A: rocksdb::LRUCache::LRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:477)
==42753==    by 0x46238C: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==42753==    by 0x46238C: _S_construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==42753==    by 0x46238C: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==42753==    by 0x46238C: _Sp_counted_ptr_inplace<long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:399)
==42753==    by 0x46238C: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==42753==    by 0x46238C: _S_construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==42753==    by 0x46238C: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==42753==    by 0x46238C: __shared_count<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:502)
==42753==    by 0x46238C: __shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:957)
==42753==    by 0x46238C: shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:316)
==42753==    by 0x46238C: allocate_shared<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:598)
==42753==    by 0x46238C: make_shared<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:614)
==42753==    by 0x46238C: rocksdb::NewLRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:572)
==42753==    by 0x43A5C5: rocksdb::Benchmark::NewCache(long) (db_bench_tool.cc:2656)
==42753==    by 0x43E885: rocksdb::Benchmark::Benchmark() (db_bench_tool.cc:2685)
==42753==    by 0x4312EE: rocksdb::db_bench_tool(int, char**) (db_bench_tool.cc:7155)
==42753==    by 0x69BB554: (below main) (in /usr/lib64/libc-2.17.so)
==42753==
==42753== LEAK SUMMARY:
==42753==    definitely lost: 16,432 bytes in 3 blocks
==42753==    indirectly lost: 8,192 bytes in 64 blocks
==42753==      possibly lost: 0 bytes in 0 blocks
==42753==    still reachable: 55,596 bytes in 1,478 blocks
==42753==                       of which reachable via heuristic:
==42753==                         stdstring          : 977 bytes in 11 blocks
==42753==         suppressed: 0 bytes in 0 blocks
==42753== Reachable blocks (those to which a pointer was found) are not shown.
==42753== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==42753==
==42753== For lists of detected and suppressed errors, rerun with: -s
==42753== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

@siying
Contributor

siying commented Aug 13, 2020

@thatsafunnyname the LRU cache leak is intentional. It can be avoided by removing this line: https://github.com/facebook/rocksdb/blob/master/tools/db_bench_tool.cc#L2764 . That line intentionally leaks the blocks in the block cache to speed up program shutdown.

I can't reproduce the __cxa_thread_atexit leak in my environment on master. I will try 6.11.4.

@siying
Contributor

siying commented Aug 13, 2020

Hmm, I can't reproduce it in 6.11.4 either. Which allocator are you using? We usually use the glibc allocator when running valgrind (DISABLE_JEMALLOC=1), as jemalloc doesn't work well with valgrind.

@thatsafunnyname
Contributor

Thanks for taking a look and trying to reproduce.

I had started with the allocator we run with, so jemalloc 5.2.0, but had also tried jemalloc 4.5.0 (valgrind support being dropped in v5) and jemalloc 3.6.0.
I saw the same 2 __cxa_thread_atexit leaks in all of them.
I just also tried building with jemalloc disabled to use the glibc allocator:

  make DISABLE_JEMALLOC=1 db_bench -j48

I checked with ldd and strace that db_bench is not using libjemalloc.
I still see the same 2 __cxa_thread_atexit leaks.
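
The ldd part of that check can be sketched roughly as follows. This is an illustration only: the helper name `uses_jemalloc` is hypothetical, and the check only sees dynamically linked libraries, so a statically linked jemalloc would not show up.

```shell
# Hypothetical helper: succeeds (exit 0) when ldd lists a jemalloc library
# among the dynamic dependencies of the binary passed as $1.
uses_jemalloc() {
    ldd "$1" 2>/dev/null | grep -qi 'jemalloc'
}

bin=./db_bench
if [ ! -x "$bin" ]; then
    echo "skip: $bin not found"
elif uses_jemalloc "$bin"; then
    echo "$bin links against jemalloc"
else
    echo "$bin does not link against jemalloc"
fi
```

Running the same helper against a binary built without DISABLE_JEMALLOC=1 gives a quick way to confirm that the flag changed what gets linked.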

I will try on an AWS EC2 AL2 host (no jemalloc libs installed) tomorrow. At the moment I am getting gflags link problems when trying to build 6.11.4 on AL2.

@siying
Contributor

siying commented Aug 13, 2020

@thatsafunnyname that's interesting. Thanks for trying it. Let us know what you find.

@thatsafunnyname
Contributor

Update summary:

I ran into "illegal instruction" problems running valgrind with RDB built on AL2, so I had to build RDB with PORTABLE=1 to avoid them. When built with PORTABLE=1 on an AL2 host, I could not reproduce the __cxa_thread_atexit lost blocks.

I did an AL2 on-host build of the latest valgrind (valgrind-3.17.0.GIT), and it had the same problem (illegal instruction) when RDB was not built with PORTABLE=1.
While I was building valgrind, I also built the latest valgrind from git back on a RHEL7 host (with no jemalloc libs present); it still reported the __cxa_thread_atexit lost blocks.
I also built RDB with PORTABLE=1 back on a RHEL7 host (with no jemalloc libs present); it still reported the __cxa_thread_atexit lost blocks.

I am going to test with some compilers other than gcc-c++-4.8.5-39 on the RHEL7 host.

Details of the valgrind error:

On a newly started:

"Amazon Linux 2" AMI with kernel 4.14.186-146.268.amzn2.x86_64 - amzn2-ami-hvm-2.0.20200722.0-x86_64-gp2 (ami-02354e95b39ca8dec)

sudo yum install gcc gcc-c++ # 7.3.1-9
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib

sudo yum install snappy-devel # 1.1.0-3

mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
make db_bench

sudo yum install valgrind # 3.13.0-9

valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==16138== Memcheck, a memory error detector
==16138== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16138== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16138== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==16138==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0x75 0x28 0xEF 0xC9 0x48 0xC7 0x43 0x10
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==16138== valgrind: Unrecognised instruction at address 0x74c3cc.
==16138==    at 0x74C3CC: __mutex_base (std_mutex.h:68)
==16138==    by 0x74C3CC: mutex (std_mutex.h:94)
==16138==    by 0x74C3CC: Data (sync_point_impl.h:26)
==16138==    by 0x74C3CC: SyncPoint (sync_point.cc:20)
==16138==    by 0x74C3CC: rocksdb::SyncPoint::GetInstance() (sync_point.cc:16)
==16138==    by 0x63603B: rocksdb::Env::Default() (env_posix.cc:507)
==16138==    by 0x69870A: rocksdb::DBOptions::DBOptions() (options.h:404)
==16138==    by 0x43934E: rocksdb::Options::Options() (options.h:1152)
==16138==    by 0x40A966: __static_initialization_and_destruction_0(int, int) [clone .constprop.1047] (db_bench_tool.cc:317)
==16138==    by 0x8658C4: __libc_csu_init (in /home/ec2-user/rocksdb/rocksdb-6.11.4/db_bench)
==16138==    by 0x6183FBA: (below main) (in /usr/lib64/libc-2.26.so)
==16138== Your program just tried to execute an instruction that Valgrind
==16138== did not recognise.  There are two possible reasons for this.
==16138== 1. Your program has a bug and erroneously jumped to a non-code
==16138==    location.  If you are running Memcheck and you just saw a
==16138==    warning about a bad jump, it's probably your program's fault.
==16138== 2. The instruction is legitimate but Valgrind doesn't handle it,
==16138==    i.e. it's Valgrind's fault.  If you think this is the case or
==16138==    you are not sure, please let us know and we'll try to fix it.
==16138== Either way, Valgrind will now raise a SIGILL signal which will
==16138== probably kill your program.
==16138==
==16138== Process terminating with default action of signal 4 (SIGILL)
==16138==  Illegal opcode at address 0x74C3CC
==16138==    at 0x74C3CC: __mutex_base (std_mutex.h:68)
==16138==    by 0x74C3CC: mutex (std_mutex.h:94)
==16138==    by 0x74C3CC: Data (sync_point_impl.h:26)
==16138==    by 0x74C3CC: SyncPoint (sync_point.cc:20)
==16138==    by 0x74C3CC: rocksdb::SyncPoint::GetInstance() (sync_point.cc:16)
==16138==    by 0x63603B: rocksdb::Env::Default() (env_posix.cc:507)
==16138==    by 0x69870A: rocksdb::DBOptions::DBOptions() (options.h:404)
==16138==    by 0x43934E: rocksdb::Options::Options() (options.h:1152)
==16138==    by 0x40A966: __static_initialization_and_destruction_0(int, int) [clone .constprop.1047] (db_bench_tool.cc:317)
==16138==    by 0x8658C4: __libc_csu_init (in /home/ec2-user/rocksdb/rocksdb-6.11.4/db_bench)
==16138==    by 0x6183FBA: (below main) (in /usr/lib64/libc-2.26.so)
==16138==
==16138== HEAP SUMMARY:
==16138==     in use at exit: 13,818 bytes in 217 blocks
==16138==   total heap usage: 218 allocs, 1 frees, 86,522 bytes allocated
==16138==
==16138== LEAK SUMMARY:
==16138==    definitely lost: 0 bytes in 0 blocks
==16138==    indirectly lost: 0 bytes in 0 blocks
==16138==      possibly lost: 0 bytes in 0 blocks
==16138==    still reachable: 13,818 bytes in 217 blocks
==16138==         suppressed: 0 bytes in 0 blocks
==16138== Reachable blocks (those to which a pointer was found) are not shown.
==16138== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==16138==
==16138== For counts of detected and suppressed errors, rerun with: -v
==16138== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction

I saw the same failure when using an AL2 on-host compiled valgrind-3.17.0.GIT.
When I build with PORTABLE=1 make db_bench -j48 on the AL2 host, valgrind does not fail with the illegal instruction and does not report the __cxa_thread_atexit lost blocks.
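
One plausible reading of the failure, offered as an assumption and not something confirmed in this thread: the unhandled instruction bytes start with 0x62, which is the EVEX prefix used by AVX-512 encodings, and a build without PORTABLE=1 targets the build host's own CPU. If so, a quick look at the CPU flags shows whether a host is likely to hit this. The helper name `avx512_flags` below is hypothetical.

```shell
# Hypothetical helper: print the distinct avx512* feature flags found in a
# cpuinfo-style file (defaults to /proc/cpuinfo), one per line.
avx512_flags() {
    grep -o 'avx512[a-z0-9_]*' "${1:-/proc/cpuinfo}" 2>/dev/null | sort -u
}

flags=$(avx512_flags)
if [ -n "$flags" ]; then
    echo "CPU advertises AVX-512 features:"
    echo "$flags"
    echo "For valgrind runs, consider: PORTABLE=1 make db_bench"
else
    echo "No AVX-512 flags found"
fi
```

The sketch only reports what the CPU advertises. Whether the compiler actually emitted AVX-512 instructions depends on the build flags in use.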

@thatsafunnyname
Contributor

With gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) from devtoolset-4 on the RHEL7 host, valgrind still reports the 2 __cxa_thread_atexit lost-block records, and it does so with clang++ 3.7.1 as well.

On a RHEL8.2 host on AWS with kernel 4.18.0-193.el8.x86_64, RHEL-8.2.0_HVM-20200423-x86_64-0-Hourly2-GP2 (ami-098f16afa9edf40be), I had to build RDB with PORTABLE=1 to avoid the illegal instruction error from valgrind, and could not reproduce the __cxa_thread_atexit lost blocks.

Details for the RHEL8.2 host on AWS EC2:

sudo yum install wget make gcc gcc-c++ # 8.3.1-5
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib
sudo yum install snappy-devel # 1.1.7-5 ( have to use an additional repo )
mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
PORTABLE=1 make db_bench -j12
sudo yum install valgrind # 1:3.15.0-11
valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
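
A pattern in the traces so far, offered as a hypothesis and not something confirmed in this thread: the runs that show the lost blocks load libc-2.12 or libc-2.17, while the AL2 run loads libc-2.26. To my knowledge, glibc 2.18 added `__cxa_thread_atexit_impl`, and libstdc++ falls back to its own `__cxa_thread_atexit` bookkeeping (the `atexit_thread.cc` frame in the RHEL7 trace) when glibc lacks it. The sketch below only compares the local glibc version against that assumed threshold; the helper name is hypothetical.

```shell
# Hypothetical helper: succeeds when the glibc version in $1 ("MAJOR.MINOR")
# is at least 2.18, the release assumed here to have introduced
# __cxa_thread_atexit_impl.
glibc_has_thread_atexit_impl() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 2 ]; then
        return 0
    fi
    [ "$major" -eq 2 ] && [ "$minor" -ge 18 ]
}

# getconf prints something like "glibc 2.17" on glibc systems.
version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
if [ -z "$version" ]; then
    echo "glibc version unknown (not a glibc system?)"
elif glibc_has_thread_atexit_impl "$version"; then
    echo "glibc $version: __cxa_thread_atexit_impl should be available"
else
    echo "glibc $version: libstdc++ likely uses its own __cxa_thread_atexit"
fi
```

If the hypothesis holds, hosts in the second branch are the ones where valgrind would be expected to report the 24-byte records.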

@thatsafunnyname
Contributor

These are the steps to reproduce the __cxa_thread_atexit lost blocks, using a new RHEL7.7 host on AWS EC2.

As I cannot reproduce it on "Amazon Linux 2" or RHEL8, and it only ever seems to be 24 bytes in each of the loss records (per __cxa_thread_atexit?), I will add a valgrind suppression for it.
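
A suppression along those lines might look like the sketch below. The file name, the suppression name, and the choice to match only stacks that pass through `InstrumentedMutex::Lock` are all assumptions made for illustration; the frame patterns are derived from the stacks quoted in this thread.

```shell
# Write a Memcheck suppression for the 24-byte "definitely lost" records.
# supp_file and the suppression name are hypothetical choices.
supp_file="${TMPDIR:-/tmp}/cxa_thread_atexit.supp"
cat > "$supp_file" <<'EOF'
{
   rocksdb_cxa_thread_atexit_leak
   Memcheck:Leak
   match-leak-kinds: definite
   fun:_ZnwmRKSt9nothrow_t
   fun:__cxa_thread_atexit
   ...
   fun:*InstrumentedMutex4Lock*
}
EOF

# Re-run the reproduction with the suppression applied, if the tools exist.
if command -v valgrind >/dev/null 2>&1 && [ -x ./db_bench ]; then
    valgrind --leak-check=full --show-leak-kinds=definite \
        --suppressions="$supp_file" \
        ./db_bench --benchmarks="fillseq,compact" --num 1
else
    echo "valgrind or ./db_bench not available; wrote $supp_file only"
fi
```

The first `fun:` line is the mangled name of `operator new(unsigned long, std::nothrow_t const&)`, and the `...` line skips any intermediate frames, such as the inlined one. Dropping the last two lines would suppress every leak allocated directly by `__cxa_thread_atexit`, which is broader than this issue needs.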

On a RHEL7.7 host on AWS with kernel 3.10.0-1062.1.2.el7.x86_64, RHEL-7.7_HVM-20190923-x86_64-0-Hourly2-GP2 (ami-029c0fbe456d58bd1), building RocksDB with PORTABLE=1:

sudo yum install wget make gcc gcc-c++ # 4.8.5-39
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib
sudo yum install snappy-devel # 1.1.0-3
# (may have to use an additional repo, such as
#   sudo rpm -i http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm)
mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
PORTABLE=1 make db_bench -j48
sudo yum install valgrind # 1:3.15.0-11
valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==17872== Memcheck, a memory error detector
==17872== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17872== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==17872== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==17872==
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==17872== Warning: unimplemented fcntl command: 1036
RocksDB:    version 6.11
Date:       Fri Aug 14 13:27:27 2020
CPU:        48 * Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
CPUCache:   36608 KB
Keys:       16 bytes each
Values:     100 bytes each (50 bytes after compression)
Entries:    1
Prefix:    0 bytes
Keys per prefix:    0
RawSize:    0.0 MB (estimated)
FileSize:   0.0 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==17872== Warning: unimplemented fcntl command: 1036
DB path: [/tmp/rocksdbtest-1000/dbbench]
fillseq      :   94966.000 micros/op 10 ops/sec;    0.0 MB/s
DB path: [/tmp/rocksdbtest-1000/dbbench]
==17872== Warning: unimplemented fcntl command: 1036
==17872== Warning: unimplemented fcntl command: 1036
compact      :  392140.000 micros/op 2 ops/sec;
==17872==
==17872== HEAP SUMMARY:
==17872==     in use at exit: 80,206 bytes in 1,545 blocks
==17872==   total heap usage: 22,989 allocs, 21,444 frees, 9,105,619 bytes allocated
==17872==
==17872== 24 bytes in 1 blocks are definitely lost in loss record 586 of 1,482
==17872==    at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==17872==    by 0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)
==17872==    by 0x643938: UnknownInlinedFun (instrumented_mutex.cc:71)
==17872==    by 0x643938: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==17872==    by 0x502166: InstrumentedMutexLock (instrumented_mutex.h:56)
==17872==    by 0x502166: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2303)
==17872==    by 0x502982: rocksdb::DBImpl::BGWorkFlush(void*) (db_impl_compaction_flush.cc:2162)
==17872==    by 0x716E6B: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==17872==    by 0x7170A0: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==17872==    by 0x593A06F: ??? (in /usr/lib64/libstdc++.so.6.0.19)
==17872==    by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==17872==    by 0x61A28DC: clone (in /usr/lib64/libc-2.17.so)
==17872==
==17872== 24 bytes in 1 blocks are definitely lost in loss record 587 of 1,482
==17872==    at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==17872==    by 0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)
==17872==    by 0x643938: UnknownInlinedFun (instrumented_mutex.cc:71)
==17872==    by 0x643938: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==17872==    by 0x502E5B: InstrumentedMutexLock (instrumented_mutex.h:56)
==17872==    by 0x502E5B: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2382)
==17872==    by 0x50380B: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2174)
==17872==    by 0x716E6B: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==17872==    by 0x7170A0: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==17872==    by 0x593A06F: ??? (in /usr/lib64/libstdc++.so.6.0.19)
==17872==    by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==17872==    by 0x61A28DC: clone (in /usr/lib64/libc-2.17.so)
==17872==
==17872== 24,576 (16,384 direct, 8,192 indirect) bytes in 1 blocks are definitely lost in loss record 1,482 of 1,482
==17872==    at 0x4C2C375: memalign (vg_replace_malloc.c:908)
==17872==    by 0x4C2C43F: posix_memalign (vg_replace_malloc.c:1072)
==17872==    by 0x678066: rocksdb::port::cacheline_aligned_alloc(unsigned long) (port_posix.cc:210)
==17872==    by 0x45EEEB: rocksdb::LRUCache::LRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:477)
==17872==    by 0x45F0DC: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==17872==    by 0x45F0DC: _S_construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==17872==    by 0x45F0DC: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==17872==    by 0x45F0DC: _Sp_counted_ptr_inplace<long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:399)
==17872==    by 0x45F0DC: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==17872==    by 0x45F0DC: _S_construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==17872==    by 0x45F0DC: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==17872==    by 0x45F0DC: __shared_count<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:502)
==17872==    by 0x45F0DC: __shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:957)
==17872==    by 0x45F0DC: shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:316)
==17872==    by 0x45F0DC: allocate_shared<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:598)
==17872==    by 0x45F0DC: make_shared<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:614)
==17872==    by 0x45F0DC: rocksdb::NewLRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:572)
==17872==    by 0x4387C5: rocksdb::Benchmark::NewCache(long) (db_bench_tool.cc:2656)
==17872==    by 0x43C9C5: rocksdb::Benchmark::Benchmark() (db_bench_tool.cc:2685)
==17872==    by 0x42FD6E: rocksdb::db_bench_tool(int, char**) (db_bench_tool.cc:7155)
==17872==    by 0x60C6554: (below main) (in /usr/lib64/libc-2.17.so)
==17872==
==17872== LEAK SUMMARY:
==17872==    definitely lost: 16,432 bytes in 3 blocks
==17872==    indirectly lost: 8,192 bytes in 64 blocks
==17872==      possibly lost: 0 bytes in 0 blocks
==17872==    still reachable: 55,582 bytes in 1,478 blocks
==17872==                       of which reachable via heuristic:
==17872==                         stdstring          : 979 bytes in 11 blocks
==17872==         suppressed: 0 bytes in 0 blocks
==17872== Reachable blocks (those to which a pointer was found) are not shown.
==17872== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==17872==
==17872== For lists of detected and suppressed errors, rerun with: -s
==17872== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

@thatsafunnyname
Contributor

Also noticed at https://jira.mariadb.org/browse/MDEV-21788

@thatsafunnyname
Contributor

0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)

is

0x58E1C5D: __cxa_thread_atexit (atexit_thread.cc:130)

extern "C" int
__cxxabiv1::__cxa_thread_atexit (void (*dtor)(void *), void *obj, void */*dso_handle*/)
  _GLIBCXX_NOTHROW
{
  // Do this initialization once.
  if (__gthread_active_p ())
    {
      // When threads are active use __gthread_once.
      static __gthread_once_t once = __GTHREAD_ONCE_INIT;
      __gthread_once (&once, key_init);
    }
  else
    {
      // And when threads aren't active use a static local guard.
      static bool queued;
      if (!queued)
        {
          queued = true;
          std::atexit (run);
        }
    }

  elt *first;
  if (__gthread_active_p ())
    first = static_cast<elt*>(__gthread_getspecific (key));
  else
    first = single_thread;

  elt *new_elt = new (std::nothrow) elt;    // <-- line 130 HERE
  if (!new_elt)
    return -1;
  new_elt->destructor = dtor;
  new_elt->object = obj;
  new_elt->next = first;

  if (__gthread_active_p ())
    __gthread_setspecific (key, new_elt);
  else
    single_thread = new_elt;

  return 0;
}

Similar issue reported at:

cameron314/concurrentqueue#152

There is also an upstream bug report, which I think this issue can be closed in favour of, as "fixed in upstream".

https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg551653.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83029
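
For anyone wondering why __cxa_thread_atexit shows up under InstrumentedMutex::Lock() at all: libstdc++ calls it the first time a thread touches a thread_local object with a non-trivial destructor, and the 24-byte `elt` allocated there is the list node that remembers which destructor to run at thread exit. Below is a minimal sketch of that pattern. `Tracker`, `touch` and `run_threads` are hypothetical names made up for illustration and are not RocksDB code.

```cpp
#include <atomic>
#include <thread>

// Counts how many thread_local Tracker objects have been destroyed.
static std::atomic<int> g_destroyed{0};

// Hypothetical type with a non-trivial destructor.
struct Tracker {
  int uses = 0;
  ~Tracker() { g_destroyed.fetch_add(1); }
};

// The first call on each thread constructs `t`; with libstdc++ that
// construction registers ~Tracker through __cxa_thread_atexit.
static void touch() {
  thread_local Tracker t;
  ++t.uses;
}

// Starts `n` short-lived threads one after another, each touching the
// thread_local once, and returns how many destructors had run once
// every thread was joined.
int run_threads(int n) {
  g_destroyed = 0;
  for (int i = 0; i < n; ++i) {
    std::thread th(touch);
    th.join();
  }
  return g_destroyed.load();
}
```

Each thread that exits normally should run its registered destructor before join() returns, so `run_threads(3)` is expected to report 3. Running a binary built from something like this under valgrind on an affected libstdc++ may be a smaller reproducer than db_bench, though I have not verified that it shows the same lost blocks.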

@lday0321

lday0321 commented Nov 6, 2020

We also hit this memory leak error with an ASAN check:

#ASAN_OPTIONS=fast_unwind_on_malloc=true,detect_stack_use_after_return=1,detect_odr_violation=2,detect_container_overflow=1,log_path=stderr,new_delete_type_mismatch=1,alloc_dealloc_mismatch=1,suppressions=/v/asan_conf/asan.supp  ./column_family_test --gtest_filter=FormatDef/ColumnFamilyTest.FlushTest/0
Note: Google Test filter = FormatDef/ColumnFamilyTest.FlushTest/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FormatDef/ColumnFamilyTest
[ RUN      ] FormatDef/ColumnFamilyTest.FlushTest/0
[       OK ] FormatDef/ColumnFamilyTest.FlushTest/0 (2983 ms)
[----------] 1 test from FormatDef/ColumnFamilyTest (2984 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (2986 ms total)
[  PASSED  ] 1 test.

=================================================================
==24675==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fb5d4d6be9f in operator new(unsigned long, std::nothrow_t const&) (/lib64/libasan.so.5+0x10fe9f)
    #1 0x7fb5d1638135 in __cxa_thread_atexit (/lib64/libstdc++.so.6+0xa5135)

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).

ns-codereview pushed a commit to couchbase/tlm that referenced this issue Sep 8, 2023
When moving from ubuntu to centos based workers, new memory leaks
emerged in some tests:

  ep-engine_ep_unit_tests.RocksFullOrValue/DurabilityWarmupTest:
    Direct leak of 48 byte(s) in 2 object(s) allocated from:
      #0 0x6e2e27 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
      #1 0x7efc40b38fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
    SUMMARY: AddressSanitizer: 48 byte(s) leaked in 2 allocation(s).

  ep_testsuite_checkpoint.value_eviction.rocksdb:
    Direct leak of 288 byte(s) in 12 object(s) allocated from:
      #0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
      #1 0x7faea5f13fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
    SUMMARY: AddressSanitizer: 288 byte(s) leaked in 12 allocation(s).

  ep_testsuite_checkpoint.full_eviction.rocksdb:
    Direct leak of 360 byte(s) in 15 object(s) allocated from:
      #0 0x571427 in operator new(unsigned long, std::nothrow_t const&) /tmp/llvm-project/compiler-rt/lib/asan/asan_new_delete.cc:105:3
      #1 0x7f3810c11fcd in __cxa_thread_atexit /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/atexit_thread.cc:146:37
    SUMMARY: AddressSanitizer: 360 byte(s) leaked in 15 allocation(s).

This appears to perhaps be an issue with rocksdb (similar errors in
facebook/rocksdb#5931) or maybe libstdc++
itself, but given we don't ship rocksdb, this change simply adds
__cxa_thread_atexit to the suppressions file.

Change-Id: I615822181c54aa7fc3b0c0db00d7b70008323008
Reviewed-on: https://review.couchbase.org/c/tlm/+/196843
Tested-by: Blair Watt <blair.watt@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
ns-codereview pushed a commit to couchbase/tlm that referenced this issue Nov 9, 2023
(Backport to fix Asan issues on neo branch under MB-59356)

Change-Id: I615822181c54aa7fc3b0c0db00d7b70008323008
Reviewed-on: https://review.couchbase.org/c/tlm/+/200473
Well-Formed: Restriction Checker
Reviewed-by: Trond Norbye <trond.norbye@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>