Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize data back to heaptrack in another thread + cache searches on trace tree #13

Closed

Conversation

brenoguim
Copy link

@brenoguim brenoguim commented Apr 16, 2018

Do not merge!

I've been using heaptrack every now and then, and while it's much faster than valgrind (which simply gives up), it takes over a day to run it on my case. So I spent some time to optimize it.

If improvements like this have been discussed and ruled out, please go ahead and close it. I imagine there may be subtle details which could invalidate this proposal

This pull request contains experimental changes to improve the performance of the allocation data serialization back to heaptrack interpreter. It's just a proof of concept, really.

There are two main changes here:

  1. Instead of calling fprintf on every allocation/deallocation, just add the data to a queue. A thread is created to consume data on this queue and call fprintf.
  • I'm a bit concerned fprintf may call malloc/free, which I suppose could cause deadlock. It hasn't deadlocked in any of the runs though, so maybe I'm missing something.
  1. Create a cache of "last search" on each node of the trace tree. This helps speeding up consecutive allocations on the same (or similar) stack.

I used the following testcase for performance validation:
main.txt

This testcase is tailored for the two optimizations, so I wouldn't expect the same results on real scenarios. I did not test this on my testcase yet (don't have access to the company network on weekend :) )

This does a bunch of string manipulations and sleep every now and then. The sleep is to simulate time not doing allocations on real applications.

  • Time without heaptrack: 54 seconds
  • Time with heaptrack: 100 seconds
  • Time with heaptrack + this patch: 74 seconds

I will validate this patch on my 1 day testcase during this week, and if it gives significant improvement, I can work on separate it on two or three changesets, fix bugs, improve error handling and so on...

I created a pull request to kick off a discussion (if it hasn't been discussed yet), and gather feedback.

@milianw
Copy link
Collaborator

milianw commented Apr 16, 2018

Some early comments, while we wait for your real-world test data to arrive:

  1. I find it strange that introducing buffering is improving things. Consider that the FILE API we are using is actually buffered already, no? Maybe we want to increase the internal buffer there?

  2. If against my intuition (see above) we do want to build our own buffer, then please at least reuse code form boost for a lockfree queue

  3. My long-term plan to optimize heaptrack I/O is to introduce a binary data protocol to get rid of the text conversion overhead. While this will drastically increase the size of an uncompressed heaptrack data stream, I don't think it's important as we apply zstd/gz compression anyways.

  4. Once we have binary data streams, I also would like to try and split the data streams apart. I.e. instead of generating just one file, actually create multiple files, one for every "type" of message. This will allow us to load the strings, trees and other meta data once, and then we can just go through the allocation data stream that should just have one 32bit integer entry per (de)allocation. These integers are then just indices, negative values will encode deallocations. The timestamps file will map times to offsets into the (de)allocation stream. This should drastically reduce the size of the data files, since we won't need to keep repeating ourselves for the "type" of the message. And it should also allow us to parse the allocation data stream in parallel eventually.

  5. Your idea to cache the last IP in TraceTree sounds like a very good idea, please split that out into a separate commit and we can merge it.

@brenoguim
Copy link
Author

I hope to get these results by the end of the week. Some comments on your feedback:

  1. I wasn't very explicit about that. The expensive part on fprintf is actually the string formatting part, since as you pointed, the actual sending is buffered.
  2. I refrained from using a lock-free queue, because it will (most likely) use malloc/free to add items to the queue. I wasn't sure if that would be ok, but I guess it is.
  3. This is probably the best solution. This could be done in two parts (right?) - change the communication of the user process and heaptrack_interpret, and change the serialization to the actual file. I guess just the first one would give significant performance improvement.
  4. This would be heaven I guess :)
  5. Great! Will do.

@brenoguim
Copy link
Author

As I moved to trunk, I got a few problems about "Trace recursion detected - corrupt data file?". I'll dig into that first.

@milianw
Copy link
Collaborator

milianw commented Apr 16, 2018

For an old data file, or for a new data file? There was a bug that got fixed recently which sometimes lead to bogus data files. I extended the error checking and then ensured this shouldn't happen anymore - or so I thought...

@brenoguim
Copy link
Author

Ran a bisect and heaptrack stopped working on my environment as of changeset:

commit ab2a8222da3cbf2c061578f8215a1b771ff5bb90
Author: Milian Wolff <milian.wolff@kdab.com>
Date:   Wed Jan 17 16:17:57 2018 +0100

    Check symbol table size before accessing it
    
    Fixes crash on 32bit systems when using attach.

diff --git a/src/track/heaptrack_inject.cpp b/src/track/heaptrack_inject.cpp
index 717d391..e67afa3 100644
--- a/src/track/heaptrack_inject.cpp
+++ b/src/track/heaptrack_inject.cpp
@@ -241,9 +241,11 @@ void try_overwrite_elftable(const T& jumps, const elf_string_table& strings, con
     const auto rela_end = reinterpret_cast<Elf::Rela*>(reinterpret_cast<char*>(jumps.table) + jumps.size);
     for (auto rela = jumps.table; rela < rela_end; rela++) {
         const auto index = ELF_R_SYM(rela->r_info);
-        const char* symname = strings.table + symbols.table[index].st_name;
-        auto addr = rela->r_offset + base;
-        hooks::apply(symname, addr, restore);
+        if (index > 0 && index < symbols.size) {
+            const char* symname = strings.table + symbols.table[index].st_name;
+            auto addr = rela->r_offset + base;
+            hooks::apply(symname, addr, restore);
+        }
     }
 }

Not sure why yet

@milianw
Copy link
Collaborator

milianw commented Apr 16, 2018

I doubt that, unless you are runtime attaching? Did you run the bisect test multiple times - maybe it's a race or other flaky failure?

@brenoguim
Copy link
Author

brenoguim commented Apr 16, 2018

The bisect is very stable. I am doing runtime attaching.
The index < symbols.size is the condition causing that behavior.

Commenting that clause on the head of the repository works for me. I'll put a pin on that for now.

@milianw
Copy link
Collaborator

milianw commented Apr 16, 2018

what is the value of index? What is the value of symbols.size? Note also that there was an issue in this commit which was fixed in the follow-up commit (it should read index >= 0).

I also don't see how this yields the "trace recursion" error you are encountering...

@brenoguim
Copy link
Author

I'll get the value of index as soon as I can. I'm running my tests now. I ran the baseline (trunk + index "fix") and it took 20 hours. I kicked off the version with this patch (trunk + index "fix" + this patch) and let's see how it goest tomorrow.

@brenoguim
Copy link
Author

brenoguim commented Apr 19, 2018

Well, 20 hours again. It did not improve the performance. That's sad, because then I'd have to start profiling my own application with heaptrack... but that takes 20h. It will be a nightmare.

Things to do:

  1. Create a smaller version of my testcase and profile it
  2. Create a separate patch for the trace tree caching
  3. Investigate the issue with the index (perhaps stop using the runtime injection?)

But I must say... I'm much more inclined to work on the binary serialization rather than this queue/threading complexity.

@milianw
Copy link
Collaborator

milianw commented Apr 19, 2018

How long does it take without heaptrack? I'd welcome if you could do 2. and 3. The former can be applied right away, the latter is a serious bug that needs to be fixed, but which I cannot reproduce. And that's more important than performance :)

If you then want to work on binary serialization, let's discuss this afterwards in more detail.

@brenoguim
Copy link
Author

brenoguim commented Apr 19, 2018

It takes around 1h30 to run without heaptrack. That's why I'm so eager to get performance improvement.
It might be relevant to mention it runs on a JVM, but the heavy stuff is done in C++.

I'll try to do 3 next then.

@milianw
Copy link
Collaborator

milianw commented Apr 23, 2018

Did you get a profile with perf to see where the time is actually spent?

kdesysadmin pushed a commit that referenced this pull request Apr 23, 2018
Hopefully fixes issues such as this one:

heaptrack debug(2) [8859:8859]@5830 dlopen_notify_callback: /tmp/KDevelop-5.2.1-x86_64/usr/lib/libxcb-dri2.so.0 7fd27a93d000
heaptrack debug(1) [8885:8885]@5830 child_fork()
heaptrack debug(1) [8859:8859]@5831 write error 4/Interrupted system call
#1  0x00007fd298297f1e sp=0x00007ffe33957bd0 _ZN12_GLOBAL__N_19HeapTrack10writeErrorEv + 0x4c
#2  0x00007fd298297cff sp=0x00007ffe33957c00 _ZN12_GLOBAL__N_19HeapTrack24dl_iterate_phdr_callbackEP12dl_phdr_infomPv + 0x13c
#3  0x00007fd29123fc81 sp=0x00007ffe33957c50 dl_iterate_phdr + 0x171
#4  0x00007fd298297ec1 sp=0x00007ffe33957d00 _ZN12_GLOBAL__N_19HeapTrack17updateModuleCacheEv + 0xa5
#5  0x00007fd298297ae0 sp=0x00007ffe33957d20 _ZN12_GLOBAL__N_19HeapTrack12handleMallocEPvmRK5Trace + 0x54
#6  0x00007fd298298805 sp=0x00007ffe33957d60 heaptrack_malloc + 0xef
#7  0x00007fd298295f9b sp=0x00007ffe33957fb0 malloc + 0x5f
#8  0x00007fd291ab6089 sp=0x00007ffe33957fe0 _Znwm + 0x19
#9  0x00007fd29205cd42 sp=0x00007ffe33957ff0 _ZN7QObjectC1EPS_ + 0x22
#10 0x00007fd292023374 sp=0x00007ffe33958010 _ZN8QLibraryC2ERK7QStringP7QObject + 0x14
#11 0x00007fd292ef53d0 sp=0x00007ffe33958030 _ZN13KPluginLoader4loadEv + 0x60
#12 0x00007fd292ef54aa sp=0x00007ffe33958090 _ZN13KPluginLoader8instanceEv + 0xa
#13 0x00007fd292ef54ec sp=0x00007ffe339580a0 _ZN13KPluginLoader7factoryEv + 0x1c
#14 0x00007fd297d4eece sp=0x00007ffe33958140 _ZN8KDevelop16PluginController18loadPluginInternalERK7QString + 0x58e
#15 0x00007fd297d50396 sp=0x00007ffe339582a0 _ZN8KDevelop16PluginController10initializeEv + 0x8d6
#16 0x00007fd297d5ab85 sp=0x00007ffe339583f0 _ZN8KDevelop11CorePrivate10initializeENS_4Core5SetupERK7QString + 0xb95
#17 0x00007fd297d5c03d sp=0x00007ffe33958490 _ZN8KDevelop4Core10initializeENS0_5SetupERK7QString + 0x5d
#18 0x000000000040b472 sp=0x00007ffe339584c0 main + 0x5182
#19 0x00007fd29112ff4a sp=0x00007ffe339587d0 __libc_start_main + 0xea
#20 0x000000000040caed sp=0x00007ffe33958890 _start + 0x29
heaptrack debug(1) [8859:8859]@5858 shutdown()
heaptrack debug(1) [8859:8859]@5858 destroying LockedData
heaptrack debug(1) [8859:8859]@5858 done destroying LockedData
heaptrack debug(1) [8859:8859]@5885 shutdown() done

CCBUG: 393387
@milianw
Copy link
Collaborator

milianw commented Apr 27, 2018

Hey @brenoguim,

the fallout to fix https://bugs.kde.org/show_bug.cgi?id=393387 included some patches that should also drastically improve the performance of heaptrack under some conditions. Can you please test current heaptrack git master against your workload again to see how long it takes now? I hope it's now less than 20x slower than without it.

The new code base should also make it super trivial to add a separate writer thread, if desired. Though I don't believe that this should be required. The perf results I've measured locally don't point at I/O anymore.

Cheers

@brenoguim
Copy link
Author

That's awesome! I'll try that ASAP. Thanks @milianw !

@brenoguim
Copy link
Author

Hey! I tested the trunk and got the same 20h time.

But I was finally able to get perf to run on my actual work environment and it's embarrassing... Libunwind was the one taking all the time, not heaptrack. I still don't know why it's behaving SO bad.

Anyway, I built a recent libunwind from source and now the time dropped to 2 hours. It's amazing!
Now I can run my tests several times a day instead of once a day.

Thanks for this amazing tool!

PS: I was kind of hoping that running with the new libunwind the "index" problem would go away, but it didn't. Still have to look into that.

@brenoguim brenoguim closed this Apr 30, 2018
@milianw
Copy link
Collaborator

milianw commented May 1, 2018 via email

@brenoguim
Copy link
Author

I'm glad this worked out well for both of us! :)

@brenoguim brenoguim deleted the feature/performance_improvements branch May 23, 2018 18:55
kdesysadmin pushed a commit that referenced this pull request Nov 28, 2018
If we get into the rare situation of accessing two separate files
through the buffered C I/O API, we could end up in a deadlock:

Thread 2 (Thread 0x7ffff77cb700 (LWP 9265)):
#0  0x00007ffff7e8b12b in __lll_lock_wait_private () from /usr/lib/libc.so.6
#1  0x00007ffff7dffaa2 in __GI__IO_file_underflow () from /usr/lib/libc.so.6
#2  0x00007ffff7e00c76 in _IO_default_uflow () from /usr/lib/libc.so.6
#3  0x00007ffff7de1d58 in __GI__IO_vfscanf () from /usr/lib/libc.so.6
#4  0x00007ffff7def9d8 in fscanf () from /usr/lib/libc.so.6
#5  0x00007ffff7fb2235 in (anonymous namespace)::HeapTrack::writeRSS (this=0x7ffff77c4770) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:378
#6  0x00007ffff7fb2aee in (anonymous namespace)::HeapTrack::LockedData::<lambda()>::operator()(void) const (__closure=0x55555555a548)
    at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:602
#7  0x00007ffff7fb4420 in std::__invoke_impl<void, (anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> >(std::__invoke_other, (anonymous namespace)::HeapTrack::LockedData::<lambda()> &&) (__f=...) at /usr/include/c++/8.2.1/bits/invoke.h:60
#8  0x00007ffff7fb4023 in std::__invoke<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> >((anonymous namespace)::HeapTrack::LockedData::<lambda()> &&) (__fn=...) at /usr/include/c++/8.2.1/bits/invoke.h:95
#9  0x00007ffff7fb4905 in std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x55555555a548) at /usr/include/c++/8.2.1/thread:244
#10 0x00007ffff7fb48b6 in std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > >::operator()(void) (
    this=0x55555555a548) at /usr/include/c++/8.2.1/thread:253
#11 0x00007ffff7fb487b in std::thread::_State_impl<std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > > >::_M_run(void) (this=0x55555555a540) at /usr/include/c++/8.2.1/thread:196
#12 0x00007ffff7c5a063 in std::execute_native_thread_routine (__p=0x55555555a540) at /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:80
#13 0x00007ffff7d63a9d in start_thread () from /usr/lib/libpthread.so.0
#14 0x00007ffff7e7db23 in clone () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7ffff77d4dc0 (LWP 9256)):
#0  0x00007ffff7d6cedc in __lll_lock_wait () from /usr/lib/libpthread.so.0
#1  0x00007ffff7d66298 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#2  0x00007ffff7fb4b90 in __gthread_mutex_lock (__mutex=0x7ffff7fcb8a0 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:748
#3  0x00007ffff7fb4d3b in std::mutex::lock (this=0x7ffff7fcb8a0 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/8.2.1/bits/std_mutex.h:103
#4  0x00007ffff7fb1bd8 in (anonymous namespace)::HeapTrack::HeapTrack (this=0x7fffffffbc4f) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:235
#5  0x00007ffff7fb3224 in heaptrack_malloc (ptr=0x55555556c460, size=1024) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:711
#6  0x00007ffff7fb0ac5 in malloc (size=1024) at /home/milian/projects/src/heaptrack/src/track/heaptrack_preload.cpp:164
#7  0x00007ffff7df2791 in _IO_file_doallocate () from /usr/lib/libc.so.6
#8  0x00007ffff7e00bc0 in _IO_doallocbuf () from /usr/lib/libc.so.6
#9  0x00007ffff7dffe48 in __GI__IO_file_overflow () from /usr/lib/libc.so.6
#10 0x00007ffff7dfef37 in __GI__IO_file_xsputn () from /usr/lib/libc.so.6
#11 0x00007ffff7dd2876 in vfprintf () from /usr/lib/libc.so.6
#12 0x00007ffff7e8cdf0 in __printf_chk () from /usr/lib/libc.so.6
#13 0x00005555555550ea in printf (__fmt=0x555555556004 "malloc: %p\n") at /usr/include/bits/stdio2.h:107
#14 main () at /home/milian/projects/src/heaptrack/tests/manual/test.c:33

Prevent this from happening by circumventing the buffered API and
use raw open/read/lseek/close instead. scanf should still be fine to
use, since it shouldn't allocate internally.
kdesysadmin pushed a commit that referenced this pull request Sep 4, 2023
Return when we get asked for invalid data which seems to happen
through the KDChart attributes model. Fixes assertions/crashes like:

```
#0  0x00007fdcfc88e83c in  () at /usr/lib/libc.so.6
#1  0x00007fdcfc83e668 in raise () at /usr/lib/libc.so.6
#2  0x00007fdcfc8264b8 in abort () at /usr/lib/libc.so.6
#3  0x00007fdcfd8a0098 in qt_assert(char const*, char const*, int) () at /usr/lib/libQt5Core.so.5
#4  0x00007fdcfd8a0157 in  () at /usr/lib/libQt5Core.so.5
#5  0x00005649da68cf3f in QVector<QBrush>::at(int) const (this=0x6070001170a0, i=-1) at /usr/include/qt/QtCore/qvector.h:449
#6  0x00005649da70272e in ChartModel::headerData(int, Qt::Orientation, int) const (this=0x607000117060, section=-1, orientation=Qt::Horizontal, role=175763350) at /home/milian/projects/src/heaptrack/src/analyze/gui/chartmodel.cpp:69
#7  0x00007fdcfda67f5e in QAbstractProxyModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libQt5Core.so.5
#8  0x00007fdcfda7ba54 in QSortFilterProxyModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libQt5Core.so.5
#9  0x00007fdcff9b6dbf in KChart::AttributesModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libKChart.so.2
#10 0x00007fdcff9af76a in  () at /usr/lib/libKChart.so.2
#11 0x00007fdcff9a7605 in KChart::AbstractDiagram::brush(int) const () at /usr/lib/libKChart.so.2
#12 0x00007fdcff9ac495 in KChart::AbstractDiagram::datasetBrushes() const () at /usr/lib/libKChart.so.2
#13 0x00007fdcff9ccd77 in  () at /usr/lib/libKChart.so.2
#14 0x00007fdcff9cda4b in KChart::Legend::buildLegend() () at /usr/lib/libKChart.so.2
#15 0x00007fdcff9ce4ba in KChart::Legend::setNeedRebuild() () at /usr/lib/libKChart.so.2
#16 0x00007fdcfdad17f3 in  () at /usr/lib/libQt5Core.so.5
#17 0x00007fdcff98c4f3 in KChart::DiagramObserver::diagramDataChanged(KChart::AbstractDiagram*) () at /usr/lib/libKChart.so.2
#18 0x00007fdcfdad17f3 in  () at /usr/lib/libQt5Core.so.5
#19 0x00007fdcfda59182 in QAbstractItemModel::modelReset(QAbstractItemModel::QPrivateSignal) () at /usr/lib/libQt5Core.so.5
#20 0x00007fdcfda8506d in  () at /usr/lib/libQt5Core.so.5
#21 0x00007fdcfdad17f3 in  () at /usr/lib/libQt5Core.so.5
#22 0x00007fdcfda59182 in QAbstractItemModel::modelReset(QAbstractItemModel::QPrivateSignal) () at /usr/lib/libQt5Core.so.5
#23 0x00005649da708068 in ChartModel::resetData(ChartData const&) (this=0x607000117060, data=...) at /home/milian/projects/src/heaptrack/src/analyze/gui/chartmodel.cpp:261
#24 0x00005649da569a5d in operator()(ChartData const&) const (__closure=0x604000258e20, data=...) at /home/milian/projects/src/heaptrack/src/analyze/gui/mainwindow.cpp:223
```

BUG: 473634
kdesysadmin pushed a commit that referenced this pull request Sep 5, 2023
When shutdown in one thread first locks the mutex, and the timer
thread triggers a memory allocation, it would trigger a deadlock
such as:

```
Thread 2 (Thread 0x7ffff3aeb6c0 (LWP 30899) "tst_inject"):
#0  0x00007ffff6889700 in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff688fec2 in pthread_mutex_lock () from /usr/lib/libc.so.6
#2  0x00007ffff3b8f089 in __gthread_mutex_lock (__mutex=0x7ffff3bffc00 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/13.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:749
#3  0x00007ffff3b8f27e in std::mutex::lock (this=0x7ffff3bffc00 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/13.2.1/bits/std_mutex.h:113
#4  0x00007ffff3b84416 in (anonymous namespace)::HeapTrack::HeapTrack (this=0x7ffff3aea6e0) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:263
#5  0x00007ffff3b8940b in heaptrack_realloc_impl (ptr_in=0x0, size=32, ptr_out=0x603000001810) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:804
#6  0x00007ffff3b89f06 in heaptrack_realloc2 (ptr_in=0, size=32, ptr_out=105759274702864) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:886
#7  0x00007ffff3b6c48c in (anonymous namespace)::hooks::realloc::hook (ptr=0x0, size=32) at /home/milian/projects/src/heaptrack/src/track/heaptrack_inject.cpp:140
#8  0x00007ffff688de8b in pthread_getattr_np () from /usr/lib/libc.so.6
#9  0x00007ffff79015a1 in __sanitizer::GetThreadStackTopAndBottom (at_initialization=at_initialization@entry=false, stack_top=stack_top@entry=0x7ffff3aeabd8, stack_bottom=stack_bottom@entry=0x7ffff3aeabe0) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
#10 0x00007ffff79017ca in __sanitizer::GetThreadStackAndTls (main=false, stk_addr=0x7ffff7705020, stk_size=0x7ffff3aeac30, tls_addr=0x7ffff7705040, tls_size=0x7ffff3aeac28) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
#11 0x00007ffff78f0ff4 in __asan::AsanThread::SetThreadStackAndTls (this=this@entry=0x7ffff7705000, options=<optimized out>) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.h:77
#12 0x00007ffff78f14ee in __asan::AsanThread::Init (this=this@entry=0x7ffff7705000, options=options@entry=0x0) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.cpp:234
#13 0x00007ffff78f19e5 in __asan::AsanThread::ThreadStart (this=0x7ffff7705000, os_id=30899) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.cpp:264
#14 0x00007ffff688c9eb in ?? () from /usr/lib/libc.so.6
#15 0x00007ffff6910dfc in ?? () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7ffff7f19980 (LWP 30896) "tst_inject"):
#0  0x00007ffff68894ae in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff688e5f3 in ?? () from /usr/lib/libc.so.6
#2  0x00007ffff74e19b8 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:669
#3  std::thread::join (this=0x608000000270) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:134
#4  0x00007ffff3b88dde in (anonymous namespace)::HeapTrack::LockedData::~LockedData (this=0x608000000220, __in_chrg=<optimized out>) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:739
#5  0x00007ffff3b84c08 in (anonymous namespace)::HeapTrack::shutdown (this=0x7ffff4916540) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:359
--Type <RET> for more, q to quit, c to continue without paging--
#6  0x00007ffff3b8984e in heaptrack_stop () at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:839
#7  0x0000555555799eb7 in (anonymous namespace)::runInjectTest<DOCTEST_ANON_FUNC_14()::<lambda()>, DOCTEST_ANON_FUNC_14()::<lambda(void*)> >(struct {...}, struct {...}) (load=..., unload=...) at /home/milian/projects/src/heaptrack/tests/auto/no_asan/tst_inject.cpp:64
#8  0x00005555557938ba in DOCTEST_ANON_FUNC_14 () at /home/milian/projects/src/heaptrack/tests/auto/no_asan/tst_inject.cpp:80
#9  0x000055555578fb16 in doctest::Context::run (this=0x7ffff4800420) at /home/milian/projects/src/heaptrack/tests/auto/../../3rdparty/doctest.h:7013
#10 0x00005555557935ff in main (argc=1, argv=0x7fffffffc628) at /home/milian/projects/src/heaptrack/tests/auto/../../3rdparty/doctest.h:7091
(gdb)
```

To prevent this from happening, we now give up on trying to lock
the mutex when we are shutting down.
kdesysadmin pushed a commit that referenced this pull request Jun 19, 2024
When we call dl_iterate_phdr we need to have the heaptrack mutex
locked, otherwise we can get a AB/BA deadlock like this:

One thread is trying to update the module cache:
```
Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"):
#0  futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49
#2  0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128
#4  0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39
#5  0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636
#6  (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527
#7  0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271
#8  operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#9  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273
#10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
```

The other one is trying to overwrite symbols:
```
Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"):
#0  0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82
#3  (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676
#4  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265
#5  heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#6  0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#7  0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
#8  0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219
#9  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539
#10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging--
.cpp:555
#11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577
#12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74
#13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589
```
kdesysadmin pushed a commit that referenced this pull request Jun 19, 2024
When we call dl_iterate_phdr we need to have the heaptrack mutex
locked, otherwise we can get a AB/BA deadlock like this:

One thread is trying to update the module cache:
```
Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"):
#0  futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49
#2  0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128
#4  0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39
#5  0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636
#6  (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527
#7  0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271
#8  operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#9  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273
#10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
```

The other one is trying to overwrite symbols:
```
Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"):
#0  0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82
#3  (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676
#4  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265
#5  heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#6  0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#7  0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
#8  0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219
#9  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539
#10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging--
.cpp:555
#11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577
#12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74
#13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589
```
kdesysadmin pushed a commit that referenced this pull request Jun 19, 2024
When we call dl_iterate_phdr we need to have the heaptrack mutex
locked, otherwise we can get a AB/BA deadlock like this:

One thread is trying to update the module cache:
```
Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"):
#0  futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49
#2  0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128
#4  0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39
#5  0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636
#6  (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527
#7  0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271
#8  operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#9  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273
#10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
```

The other one is trying to overwrite symbols:
```
Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"):
#0  0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82
#3  (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676
#4  (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265
#5  heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880
#6  0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120
#7  0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
#8  0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219
#9  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539
#10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging--
.cpp:555
#11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577
#12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74
#13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants