-
-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serialize data back to heaptrack in another thread + cache searches on trace tree #13
Conversation
…hing searches on trace tree
Some early comments, while we wait for your real-world test data to arrive:
|
I hope to get these results by the end of the week. Some comments on your feedback:
|
As I moved to trunk, I got a few problems about "Trace recursion detected - corrupt data file?". I'll dig into that first. |
For an old data file, or for a new data file? There was a bug that got fixed recently which sometimes lead to bogus data files. I extended the error checking and then ensured this shouldn't happen anymore - or so I thought... |
Ran a bisect and heaptrack stopped working on my environment as of changeset:
Not sure why yet |
I doubt that, unless you are runtime attaching? Did you run the bisect test multiple times - maybe it's a race or other flaky failure? |
The bisect is very stable. I am doing runtime attaching. Commenting that clause on the head of the repository works for me. I'll put a pin on that for now. |
what is the value of I also don't see how this yields the "trace recursion" error you are encountering... |
I'll get the value of index as soon as I can. I'm running my tests now. I ran the baseline (trunk + index "fix") and it took 20 hours. I kicked off the version with this patch (trunk + index "fix" + this patch) and let's see how it goest tomorrow. |
Well, 20 hours again. It did not improve the performance. That's sad, because then I'd have to start profiling my own application with heaptrack... but that takes 20h. It will be a nightmare. Things to do:
But I must say... I'm much more inclined to work on the binary serialization rather than this queue/threading complexity. |
How long does it take without heaptrack? I'd welcome if you could do 2. and 3. The former can be applied right away, the latter is a serious bug that needs to be fixed, but which I cannot reproduce. And that's more important than performance :) If you then want to work on binary serialization, let's discuss this afterwards in more detail. |
It takes around 1h30 to run without heaptrack. That's why I'm so eager to get performance improvement. I'll try to do 3 next then. |
Did you get a profile with |
Hopefully fixes issues such as this one: heaptrack debug(2) [8859:8859]@5830 dlopen_notify_callback: /tmp/KDevelop-5.2.1-x86_64/usr/lib/libxcb-dri2.so.0 7fd27a93d000 heaptrack debug(1) [8885:8885]@5830 child_fork() heaptrack debug(1) [8859:8859]@5831 write error 4/Interrupted system call #1 0x00007fd298297f1e sp=0x00007ffe33957bd0 _ZN12_GLOBAL__N_19HeapTrack10writeErrorEv + 0x4c #2 0x00007fd298297cff sp=0x00007ffe33957c00 _ZN12_GLOBAL__N_19HeapTrack24dl_iterate_phdr_callbackEP12dl_phdr_infomPv + 0x13c #3 0x00007fd29123fc81 sp=0x00007ffe33957c50 dl_iterate_phdr + 0x171 #4 0x00007fd298297ec1 sp=0x00007ffe33957d00 _ZN12_GLOBAL__N_19HeapTrack17updateModuleCacheEv + 0xa5 #5 0x00007fd298297ae0 sp=0x00007ffe33957d20 _ZN12_GLOBAL__N_19HeapTrack12handleMallocEPvmRK5Trace + 0x54 #6 0x00007fd298298805 sp=0x00007ffe33957d60 heaptrack_malloc + 0xef #7 0x00007fd298295f9b sp=0x00007ffe33957fb0 malloc + 0x5f #8 0x00007fd291ab6089 sp=0x00007ffe33957fe0 _Znwm + 0x19 #9 0x00007fd29205cd42 sp=0x00007ffe33957ff0 _ZN7QObjectC1EPS_ + 0x22 #10 0x00007fd292023374 sp=0x00007ffe33958010 _ZN8QLibraryC2ERK7QStringP7QObject + 0x14 #11 0x00007fd292ef53d0 sp=0x00007ffe33958030 _ZN13KPluginLoader4loadEv + 0x60 #12 0x00007fd292ef54aa sp=0x00007ffe33958090 _ZN13KPluginLoader8instanceEv + 0xa #13 0x00007fd292ef54ec sp=0x00007ffe339580a0 _ZN13KPluginLoader7factoryEv + 0x1c #14 0x00007fd297d4eece sp=0x00007ffe33958140 _ZN8KDevelop16PluginController18loadPluginInternalERK7QString + 0x58e #15 0x00007fd297d50396 sp=0x00007ffe339582a0 _ZN8KDevelop16PluginController10initializeEv + 0x8d6 #16 0x00007fd297d5ab85 sp=0x00007ffe339583f0 _ZN8KDevelop11CorePrivate10initializeENS_4Core5SetupERK7QString + 0xb95 #17 0x00007fd297d5c03d sp=0x00007ffe33958490 _ZN8KDevelop4Core10initializeENS0_5SetupERK7QString + 0x5d #18 0x000000000040b472 sp=0x00007ffe339584c0 main + 0x5182 #19 0x00007fd29112ff4a sp=0x00007ffe339587d0 __libc_start_main + 0xea #20 0x000000000040caed sp=0x00007ffe33958890 _start + 0x29 heaptrack debug(1) [8859:8859]@5858 shutdown() heaptrack debug(1) [8859:8859]@5858 destroying LockedData heaptrack debug(1) [8859:8859]@5858 done destroying LockedData heaptrack debug(1) [8859:8859]@5885 shutdown() done CCBUG: 393387
Hey @brenoguim, the fallout to fix https://bugs.kde.org/show_bug.cgi?id=393387 included some patches that should also drastically improve the performance of heaptrack under some conditions. Can you please test current heaptrack git master against your workload again to see how long it takes now? I hope it's now less than 20x slower than without it. The new code base should also make it super trivial to add a separate writer thread, if desired. Though I don't believe that this should be required. The perf results I've measured locally don't point at I/O anymore. Cheers |
That's awesome! I'll try that ASAP. Thanks @milianw ! |
Hey! I tested the trunk and got the same 20h time. But I was finally able to get perf to run on my actual work environment and it's embarrassing... Libunwind was the one taking all the time, not heaptrack. I still don't know why it's behaving SO bad. Anyway, I built a recent libunwind from source and now the time dropped to 2 hours. It's amazing! Thanks for this amazing tool! PS: I was kind of hoping that running with the new libunwind the "index" problem would go away, but it didn't. Still have to look into that. |
On Montag, 30. April 2018 12:52:47 CEST Breno Rodrigues Guimarães wrote:
Hey! I tested the trunk and got the same 20h time.
But I was finally able to get perf to run on my actual work environment and
it's embarrassing... Libunwind was the one taking all the time, not
heaptrack. I still don't know why it's behaving SO bad.
Anyway, I built a recent libunwind from source and now the time dropped to 2
hours. It's amazing! Now I can run my tests several times a day instead of
once a day.
Yay! You probably ran a very old libunwind version then. Basically anything
prior to 1.2 is pretty damn slow. I should probably add a runtime warning for
this somehow to make people better aware of this.
Thanks for this amazing tool!
Thanks for the feedback, much appreciated :)
PS: I was kind of hoping that running with the new libunwind the "index"
problem would go away, but it didn't. Still have to look into that.
Hah, thanks Breno for bringing this up again! I actually was in the middle of
a debug session for why runtime attaching didn't work. So I looked at this
commit again and indeed, it's complete rubbish. I've reverted it again and
explained why it's so bad, cf.:
https://github.com/KDE/heaptrack/commit/
8aec0f5
This should make vanilla heaptrack useable for you again, I hope!
Thanks a lot for bringin this to my attention, I'd have wasted tons of time
without your hint here.
Cheers
…--
Milian Wolff
mail@milianw.de
http://milianw.de
|
I'm glad this worked out well for both of us! :) |
If we get into the rare situation of accessing two separate files through the buffered C I/O API, we could end up in a deadlock: Thread 2 (Thread 0x7ffff77cb700 (LWP 9265)): #0 0x00007ffff7e8b12b in __lll_lock_wait_private () from /usr/lib/libc.so.6 #1 0x00007ffff7dffaa2 in __GI__IO_file_underflow () from /usr/lib/libc.so.6 #2 0x00007ffff7e00c76 in _IO_default_uflow () from /usr/lib/libc.so.6 #3 0x00007ffff7de1d58 in __GI__IO_vfscanf () from /usr/lib/libc.so.6 #4 0x00007ffff7def9d8 in fscanf () from /usr/lib/libc.so.6 #5 0x00007ffff7fb2235 in (anonymous namespace)::HeapTrack::writeRSS (this=0x7ffff77c4770) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:378 #6 0x00007ffff7fb2aee in (anonymous namespace)::HeapTrack::LockedData::<lambda()>::operator()(void) const (__closure=0x55555555a548) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:602 #7 0x00007ffff7fb4420 in std::__invoke_impl<void, (anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> >(std::__invoke_other, (anonymous namespace)::HeapTrack::LockedData::<lambda()> &&) (__f=...) at /usr/include/c++/8.2.1/bits/invoke.h:60 #8 0x00007ffff7fb4023 in std::__invoke<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> >((anonymous namespace)::HeapTrack::LockedData::<lambda()> &&) (__fn=...) at /usr/include/c++/8.2.1/bits/invoke.h:95 #9 0x00007ffff7fb4905 in std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x55555555a548) at /usr/include/c++/8.2.1/thread:244 #10 0x00007ffff7fb48b6 in std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > >::operator()(void) ( this=0x55555555a548) at /usr/include/c++/8.2.1/thread:253 #11 0x00007ffff7fb487b in std::thread::_State_impl<std::thread::_Invoker<std::tuple<(anonymous namespace)::HeapTrack::LockedData::LockedData(int, heaptrack_callback_t)::<lambda()> > > >::_M_run(void) (this=0x55555555a540) at /usr/include/c++/8.2.1/thread:196 #12 0x00007ffff7c5a063 in std::execute_native_thread_routine (__p=0x55555555a540) at /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:80 #13 0x00007ffff7d63a9d in start_thread () from /usr/lib/libpthread.so.0 #14 0x00007ffff7e7db23 in clone () from /usr/lib/libc.so.6 Thread 1 (Thread 0x7ffff77d4dc0 (LWP 9256)): #0 0x00007ffff7d6cedc in __lll_lock_wait () from /usr/lib/libpthread.so.0 #1 0x00007ffff7d66298 in pthread_mutex_lock () from /usr/lib/libpthread.so.0 #2 0x00007ffff7fb4b90 in __gthread_mutex_lock (__mutex=0x7ffff7fcb8a0 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/8.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:748 #3 0x00007ffff7fb4d3b in std::mutex::lock (this=0x7ffff7fcb8a0 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/8.2.1/bits/std_mutex.h:103 #4 0x00007ffff7fb1bd8 in (anonymous namespace)::HeapTrack::HeapTrack (this=0x7fffffffbc4f) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:235 #5 0x00007ffff7fb3224 in heaptrack_malloc (ptr=0x55555556c460, size=1024) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:711 #6 0x00007ffff7fb0ac5 in malloc (size=1024) at /home/milian/projects/src/heaptrack/src/track/heaptrack_preload.cpp:164 #7 0x00007ffff7df2791 in _IO_file_doallocate () from /usr/lib/libc.so.6 #8 0x00007ffff7e00bc0 in _IO_doallocbuf () from /usr/lib/libc.so.6 #9 0x00007ffff7dffe48 in __GI__IO_file_overflow () from /usr/lib/libc.so.6 #10 0x00007ffff7dfef37 in __GI__IO_file_xsputn () from /usr/lib/libc.so.6 #11 0x00007ffff7dd2876 in vfprintf () from /usr/lib/libc.so.6 #12 0x00007ffff7e8cdf0 in __printf_chk () from /usr/lib/libc.so.6 #13 0x00005555555550ea in printf (__fmt=0x555555556004 "malloc: %p\n") at /usr/include/bits/stdio2.h:107 #14 main () at /home/milian/projects/src/heaptrack/tests/manual/test.c:33 Prevent this from happening by circumventing the buffered API and use raw open/read/lseek/close instead. scanf should still be fine to use, since it shouldn't allocate internally.
Return when we get asked for invalid data which seems to happen through the KDChart attributes model. Fixes assertions/crashes like: ``` #0 0x00007fdcfc88e83c in () at /usr/lib/libc.so.6 #1 0x00007fdcfc83e668 in raise () at /usr/lib/libc.so.6 #2 0x00007fdcfc8264b8 in abort () at /usr/lib/libc.so.6 #3 0x00007fdcfd8a0098 in qt_assert(char const*, char const*, int) () at /usr/lib/libQt5Core.so.5 #4 0x00007fdcfd8a0157 in () at /usr/lib/libQt5Core.so.5 #5 0x00005649da68cf3f in QVector<QBrush>::at(int) const (this=0x6070001170a0, i=-1) at /usr/include/qt/QtCore/qvector.h:449 #6 0x00005649da70272e in ChartModel::headerData(int, Qt::Orientation, int) const (this=0x607000117060, section=-1, orientation=Qt::Horizontal, role=175763350) at /home/milian/projects/src/heaptrack/src/analyze/gui/chartmodel.cpp:69 #7 0x00007fdcfda67f5e in QAbstractProxyModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libQt5Core.so.5 #8 0x00007fdcfda7ba54 in QSortFilterProxyModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libQt5Core.so.5 #9 0x00007fdcff9b6dbf in KChart::AttributesModel::headerData(int, Qt::Orientation, int) const () at /usr/lib/libKChart.so.2 #10 0x00007fdcff9af76a in () at /usr/lib/libKChart.so.2 #11 0x00007fdcff9a7605 in KChart::AbstractDiagram::brush(int) const () at /usr/lib/libKChart.so.2 #12 0x00007fdcff9ac495 in KChart::AbstractDiagram::datasetBrushes() const () at /usr/lib/libKChart.so.2 #13 0x00007fdcff9ccd77 in () at /usr/lib/libKChart.so.2 #14 0x00007fdcff9cda4b in KChart::Legend::buildLegend() () at /usr/lib/libKChart.so.2 #15 0x00007fdcff9ce4ba in KChart::Legend::setNeedRebuild() () at /usr/lib/libKChart.so.2 #16 0x00007fdcfdad17f3 in () at /usr/lib/libQt5Core.so.5 #17 0x00007fdcff98c4f3 in KChart::DiagramObserver::diagramDataChanged(KChart::AbstractDiagram*) () at /usr/lib/libKChart.so.2 #18 0x00007fdcfdad17f3 in () at /usr/lib/libQt5Core.so.5 #19 0x00007fdcfda59182 in QAbstractItemModel::modelReset(QAbstractItemModel::QPrivateSignal) () at /usr/lib/libQt5Core.so.5 #20 0x00007fdcfda8506d in () at /usr/lib/libQt5Core.so.5 #21 0x00007fdcfdad17f3 in () at /usr/lib/libQt5Core.so.5 #22 0x00007fdcfda59182 in QAbstractItemModel::modelReset(QAbstractItemModel::QPrivateSignal) () at /usr/lib/libQt5Core.so.5 #23 0x00005649da708068 in ChartModel::resetData(ChartData const&) (this=0x607000117060, data=...) at /home/milian/projects/src/heaptrack/src/analyze/gui/chartmodel.cpp:261 #24 0x00005649da569a5d in operator()(ChartData const&) const (__closure=0x604000258e20, data=...) at /home/milian/projects/src/heaptrack/src/analyze/gui/mainwindow.cpp:223 ``` BUG: 473634
When shutdown in one thread first locks the mutex, and the timer thread triggers a memory allocation, it would trigger a deadlock such as: ``` Thread 2 (Thread 0x7ffff3aeb6c0 (LWP 30899) "tst_inject"): #0 0x00007ffff6889700 in ?? () from /usr/lib/libc.so.6 #1 0x00007ffff688fec2 in pthread_mutex_lock () from /usr/lib/libc.so.6 #2 0x00007ffff3b8f089 in __gthread_mutex_lock (__mutex=0x7ffff3bffc00 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/13.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:749 #3 0x00007ffff3b8f27e in std::mutex::lock (this=0x7ffff3bffc00 <(anonymous namespace)::HeapTrack::s_lock>) at /usr/include/c++/13.2.1/bits/std_mutex.h:113 #4 0x00007ffff3b84416 in (anonymous namespace)::HeapTrack::HeapTrack (this=0x7ffff3aea6e0) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:263 #5 0x00007ffff3b8940b in heaptrack_realloc_impl (ptr_in=0x0, size=32, ptr_out=0x603000001810) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:804 #6 0x00007ffff3b89f06 in heaptrack_realloc2 (ptr_in=0, size=32, ptr_out=105759274702864) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:886 #7 0x00007ffff3b6c48c in (anonymous namespace)::hooks::realloc::hook (ptr=0x0, size=32) at /home/milian/projects/src/heaptrack/src/track/heaptrack_inject.cpp:140 #8 0x00007ffff688de8b in pthread_getattr_np () from /usr/lib/libc.so.6 #9 0x00007ffff79015a1 in __sanitizer::GetThreadStackTopAndBottom (at_initialization=at_initialization@entry=false, stack_top=stack_top@entry=0x7ffff3aeabd8, stack_bottom=stack_bottom@entry=0x7ffff3aeabe0) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #10 0x00007ffff79017ca in __sanitizer::GetThreadStackAndTls (main=false, stk_addr=0x7ffff7705020, stk_size=0x7ffff3aeac30, tls_addr=0x7ffff7705040, tls_size=0x7ffff3aeac28) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598 #11 0x00007ffff78f0ff4 in __asan::AsanThread::SetThreadStackAndTls (this=this@entry=0x7ffff7705000, options=<optimized out>) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.h:77 #12 0x00007ffff78f14ee in __asan::AsanThread::Init (this=this@entry=0x7ffff7705000, options=options@entry=0x0) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.cpp:234 #13 0x00007ffff78f19e5 in __asan::AsanThread::ThreadStart (this=0x7ffff7705000, os_id=30899) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.cpp:264 #14 0x00007ffff688c9eb in ?? () from /usr/lib/libc.so.6 #15 0x00007ffff6910dfc in ?? () from /usr/lib/libc.so.6 Thread 1 (Thread 0x7ffff7f19980 (LWP 30896) "tst_inject"): #0 0x00007ffff68894ae in ?? () from /usr/lib/libc.so.6 #1 0x00007ffff688e5f3 in ?? () from /usr/lib/libc.so.6 #2 0x00007ffff74e19b8 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>) at /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:669 #3 std::thread::join (this=0x608000000270) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:134 #4 0x00007ffff3b88dde in (anonymous namespace)::HeapTrack::LockedData::~LockedData (this=0x608000000220, __in_chrg=<optimized out>) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:739 #5 0x00007ffff3b84c08 in (anonymous namespace)::HeapTrack::shutdown (this=0x7ffff4916540) at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:359 --Type <RET> for more, q to quit, c to continue without paging-- #6 0x00007ffff3b8984e in heaptrack_stop () at /home/milian/projects/src/heaptrack/src/track/libheaptrack.cpp:839 #7 0x0000555555799eb7 in (anonymous namespace)::runInjectTest<DOCTEST_ANON_FUNC_14()::<lambda()>, DOCTEST_ANON_FUNC_14()::<lambda(void*)> >(struct {...}, struct {...}) (load=..., unload=...) at /home/milian/projects/src/heaptrack/tests/auto/no_asan/tst_inject.cpp:64 #8 0x00005555557938ba in DOCTEST_ANON_FUNC_14 () at /home/milian/projects/src/heaptrack/tests/auto/no_asan/tst_inject.cpp:80 #9 0x000055555578fb16 in doctest::Context::run (this=0x7ffff4800420) at /home/milian/projects/src/heaptrack/tests/auto/../../3rdparty/doctest.h:7013 #10 0x00005555557935ff in main (argc=1, argv=0x7fffffffc628) at /home/milian/projects/src/heaptrack/tests/auto/../../3rdparty/doctest.h:7091 (gdb) ``` To prevent this from happening, we now give up on trying to lock the mutex when we are shutting down.
When we call dl_iterate_phdr we need to have the heaptrack mutex locked, otherwise we can get a AB/BA deadlock like this: One thread is trying to update the module cache: ``` Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"): #0 futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146 #1 __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49 #2 0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48 #3 ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128 #4 0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39 #5 0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636 #6 (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527 #7 0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271 #8 operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #9 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273 #10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 ``` The other one is trying to overwrite symbols: ``` Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"): #0 0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48 #1 0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25 #2 0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82 #3 (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676 #4 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265 #5 heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #6 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #7 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 #8 0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219 #9 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539 #10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging-- .cpp:555 #11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577 #12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74 #13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589 ```
When we call dl_iterate_phdr we need to have the heaptrack mutex locked, otherwise we can get a AB/BA deadlock like this: One thread is trying to update the module cache: ``` Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"): #0 futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146 #1 __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49 #2 0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48 #3 ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128 #4 0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39 #5 0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636 #6 (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527 #7 0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271 #8 operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #9 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273 #10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 ``` The other one is trying to overwrite symbols: ``` Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"): #0 0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48 #1 0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25 #2 0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82 #3 (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676 #4 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265 #5 heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #6 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #7 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 #8 0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219 #9 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539 #10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging-- .cpp:555 #11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577 #12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74 #13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589 ```
When we call dl_iterate_phdr we need to have the heaptrack mutex locked, otherwise we can get a AB/BA deadlock like this: One thread is trying to update the module cache: ``` Thread 4 (Thread 0xffffa8e1ef40 (LWP 10016) "system-ui-execu"): #0 futex_wait (private=0, expected=2, futex_word=0xffffb13bcad0 <_rtld_global+2744>) at ../sysdeps/nptl/futex-internal.h:146 #1 __GI___lll_lock_wait (futex=futex@entry=0xffffb13bcad0 <_rtld_global+2744>, private=private@entry=0) at lowlevellock.c:49 #2 0x0000ffffae7437b0 in lll_mutex_lock_optimized (mutex=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:48 #3 ___pthread_mutex_lock (mutex=mutex@entry=0xffffb13bcad0 <_rtld_global+2744>) at pthread_mutex_lock.c:128 #4 0x0000ffffae7ed6ec in __GI___dl_iterate_phdr (callback=0xffffb1366d10 <(anonymous namespace)::HeapTrack::dl_iterate_phdr_callback(dl_phdr_info*, size_t, void*)>, data=0xffffa8e1e2d0) at dl-iteratephdr.c:39 #5 0x0000ffffb136982c in (anonymous namespace)::HeapTrack::updateModuleCache (this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:636 #6 (anonymous namespace)::HeapTrack::handleMalloc (this=this@entry=0xffffa8e1e2d0, ptr=ptr@entry=0xffff9c000b70, size=size@entry=16, trace=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:527 #7 0x0000ffffb13699f8 in (anonymous namespace)::HeapTrack::handleMalloc (trace=..., size=16, ptr=0xffff9c000b70, this=0xffffa8e1e2d0) at /home/milian/projects/kde/src/heaptrack/src/util/linewriter.h:271 #8 operator() (heaptrack=..., __closure=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #9 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:273 #10 heaptrack_malloc (ptr=0xffff9c000b70, size=16) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #11 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #12 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 ``` The other one is trying to overwrite symbols: ``` Thread 1 (Thread 0xffffb0a8d020 (LWP 10011) "system-ui-execu"): #0 0x0000ffffae774560 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry=0, flags=flags@entry=0, req=0xfffff3152150, rem=0xfffff3152150) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48 #1 0x0000ffffae779520 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25 #2 0x0000ffffb13699b4 in std::this_thread::sleep_for<long, std::ratio<1l, 1000000l> > (__rtime=..., __rtime=...) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/this_thread_sleep.h:82 #3 (anonymous namespace)::HeapTrack::tryLock<(anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> >(const (anonymous namespace)::RecursionGuard&, const heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)>&)::<lambda()> > (stopLockCheck=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:676 #4 (anonymous namespace)::HeapTrack::op<heaptrack_malloc(void*, size_t)::<lambda((anonymous namespace)::HeapTrack&)> > (op=...) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:265 #5 heaptrack_malloc (ptr=0xaaaafeb07b10, size=55) at /home/milian/projects/kde/src/heaptrack/src/track/libheaptrack.cpp:880 #6 0x0000ffffb1364484 in (anonymous namespace)::hooks::malloc::hook (size=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:120 #7 0x0000ffffae942a50 in operator new(unsigned long) () from /usr/lib/libstdc++.so.6 #8 0x0000ffffb1365464 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*> (__end=0xffffb13b5856 "", __beg=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.tcc:219 #9 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__a=..., __s=0xffffb13b5820 "/mnt/appfs/nwot-appfs1/usr/lib/libutility-framework.so", this=0xfffff3152468) at /home/milian/projects/kdab/vorwerk/sdk/sysroots/cortexa53-crypto-veld-linux/usr/include/c++/11.4.0/bits/basic_string.h:539 #10 (anonymous namespace)::cachedSymtabSize (path=<optimized out>) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject--Type <RET> for more, q to quit, c to continue without paging-- .cpp:555 #11 (anonymous namespace)::iterate_phdrs (info=0xfffff31525d0, data=0x0) at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:577 #12 0x0000ffffae7ed7f0 in __GI___dl_iterate_phdr (callback=0xffffb1364ea0 <(anonymous namespace)::iterate_phdrs(dl_phdr_info*, size_t, void*)>, data=0x0) at dl-iteratephdr.c:74 #13 0x0000ffffb1364870 in (anonymous namespace)::overwrite_symbols () at /home/milian/projects/kde/src/heaptrack/src/track/heaptrack_inject.cpp:589 ```
Do not merge!
I've been using heaptrack every now and then, and while it's much faster than valgrind (which simply gives up), it takes over a day to run it on my case. So I spent some time to optimize it.
If improvements like this have been discussed and ruled out, please go ahead and close it. I imagine there may be subtle details which could invalidate this proposal
This pull request contains experimental changes to improve the performance of the allocation data serialization back to heaptrack interpreter. It's just a proof of concept, really.
There are two main changes here:
I used the following testcase for performance validation:
main.txt
This testcase is tailored for the two optimizations, so I wouldn't expect the same results on real scenarios. I did not test this on my testcase yet (don't have access to the company network on weekend :) )
This does a bunch of string manipulations and sleep every now and then. The sleep is to simulate time not doing allocations on real applications.
I will validate this patch on my 1 day testcase during this week, and if it gives significant improvement, I can work on separate it on two or three changesets, fix bugs, improve error handling and so on...
I created a pull request to kick off a discussion (if it hasn't been discussed yet), and gather feedback.