-
-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add image cache to avoid calling dladdr() and add libbacktrace elf image list refresh #674
Add image cache to avoid calling dladdr() and add libbacktrace elf image list refresh #674
Conversation
…esolution. This cache can also be used in the runtime symbol resolution case to detect we should recreate "backtrace_state" when new images have been loaded
Gave this a try and it seems to work very well! I am able to record GNOME Shell with offline symbol resolving without much extra lag seemingly. |
public/client/TracyCallstack.cpp
Outdated
} | ||
}; | ||
#endif //#ifdef TRACY_USE_IMAGE_CACHE | ||
|
||
namespace tracy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for keeping the above code outside the tracy
namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really, moved it under
public/client/TracyCallstack.cpp
Outdated
if( !entry ) | ||
{ | ||
//printf("* addr not found: %p (%d entries) refreshing (m_numberOfRefreshes: %d)\n", | ||
// address, m_images->size(), m_numberOfRefreshes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debug code or convert to TracyDebug()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
public/client/TracyCallstack.cpp
Outdated
|
||
if( it != m_images->end() && address < it->m_endAddress ) | ||
{ | ||
return &(*it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not return it
?
CMakeLists.txt
Outdated
@@ -86,6 +86,7 @@ set_option(TRACY_NO_CRASH_HANDLER "Disable crash handling" OFF) | |||
set_option(TRACY_TIMER_FALLBACK "Use lower resolution timers" OFF) | |||
set_option(TRACE_CLIENT_LIBUNWIND_BACKTRACE "Use libunwind backtracing where supported" OFF) | |||
set_option(TRACY_SYMBOL_OFFLINE_RESOLVE "Instead of full runtime symbol resolution, only resolve the image path and offset to enable offline symbol resolution" OFF) | |||
set_option(TRACY_ENABLE_IMAGE_CACHE "On glibc platforms, when doing offline symbol resolution, use an cache to determine the image path and offset instead of dladdr()" OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for keeping this as an option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description of that option should have included the runtime resolve case. For offline symbol resolving, having the image cache be used by default sounds ok to me.
I wasn't sure what to do with the runtime resolve case, where the image cache is used to detect new images were loaded and trigger libbacktrace state recreation. Maybe the option should be renamed something like "TRACY_ENABLE_LIBBACKTRACE_DYNAMIC_LOADED_IMAGES" and only control enabling that code for the runtime resolve case. It could then be disable by default as maybe not too many users needed (if they don't do dynamic loading of shared libs) and also knowing that it comes with libbacktrace state leaking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every program, even helloworldware, will do dynamic loading of shared libs. For example, when I put a breakpoint on elf_add
in libbacktrace in the tracy test application, I see hits on /proc/self/exe
, /usr/lib/libdebuginfod.so.1
, /usr/lib/libstdc++.so.6
, /usr/lib/libm.so.6
, /usr/lib/libgcc_s.so.1
, /usr/lib/libc.so.6
, /usr/lib/libcurl.so.4
, /usr/lib/libelf.so.1
, /lib64/ld-linux-x86-64.so.2
, and so on.
The call stack for each of these library loads is:
#0 tracy::elf_add (state=0x7fffee270080, filename=0x7fffee260120 "/usr/lib/libgcc_s.so.1", descriptor=232, memory=0x0, memory_size=0, base_address=140737350017024, error_callback=0x555555603dd0 <tracy::CallstackErrorCb(void*, char const*, int)>, data=0x0, fileline_fn=0x7fffef27e608, found_sym=0x7fffef27e754, found_dwarf=0x7fffef27e604, fileline_entry=0x0, exe=0, debuginfo=0, with_buildid_data=0x0, with_buildid_size=0) at ../public/libbacktrace/elf.cpp:6578
#1 0x000055555560f2a1 in tracy::phdr_callback (info=0x7fffee3c00d0, pdata=0x7fffef27e708) at ../public/libbacktrace/elf.cpp:7411
#2 0x000055555560c80c in tracy::backtrace_initialize (state=0x7fffee270080, filename=0x555555633d60 "/proc/self/exe", descriptor=232, error_callback=0x555555603dd0 <tracy::CallstackErrorCb(void*, char const*, int)>, data=0x0, fileline_fn=0x7fffef27e7f8) at ../public/libbacktrace/elf.cpp:7459
#3 0x0000555555609bc8 in tracy::fileline_initialize (state=0x7fffee270080, error_callback=0x555555603dd0 <tracy::CallstackErrorCb(void*, char const*, int)>, data=0x0) at ../public/libbacktrace/fileline.cpp:264
#4 0x000055555560340d in tracy::backtrace_pcinfo (state=0x7fffee270080, pc=93824992316766, callback=0x555555603780 <tracy::CallstackDataCb(void*, unsigned long, unsigned long, char const*, int, char const*)>, error_callback=0x555555603dd0 <tracy::CallstackErrorCb(void*, char const*, int)>, data=0x0) at ../public/libbacktrace/fileline.cpp:298
#5 0x00005555555f2eea in tracy::DecodeCallstackPtr (ptr=93824992316766) at ../public/client/TracyCallstack.cpp:1075
#6 0x00005555555ede59 in tracy::Profiler::HandleSymbolQueueItem (this=0x555555644700 <tracy::s_profiler>, si=...) at ../public/client/TracyProfiler.cpp:3266
#7 0x00005555555f4b3a in tracy::Profiler::SymbolWorker (this=0x555555644700 <tracy::s_profiler>) at ../public/client/TracyProfiler.cpp:3388
#8 0x000055555562afc5 in tracy::Profiler::LaunchSymbolWorker (ptr=0x555555644700 <tracy::s_profiler>) at ../public/common/../client/TracyProfiler.hpp:796
#9 0x000055555562c6ad in tracy::Thread::Launch (ptr=0x7ffff63600e0) at ../public/common/../client/TracyThread.hpp:80
Note that DecodeCallstackPtr
is on the stack here, so I'm not sure of your reasoning in that sentence:
This cache can also be used in the runtime symbol resolution case to detect we should recreate "backtrace_state" when new images have been loaded, as libbacktrace symbol resolution will only consider images loaded before the first symbol resolve operation.
Can you tell me why you think the libbacktrace state should be reset when a new library appears? The trace above show that the library should handle this out of the box.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, that is strange. I'll unoptimize the code and do some stepping again in my case to make sure what I was seeing was actually correct (I'll hopefully get to it later today). But after the first symbol resolve was hit, I was missing symbol resolving for shared libraries loaded after that. Looking through the libbacktrace code I could see it was using backtrace_state::fileline_fn as the control value to already called backtrace_initialize() (which was what was responsible for the shared library indexing), and that was set to something non-null after the call. That doesn't appear to be the case for you though, maybe I misinterpreted the results in the debugger with optimized code and something else was causing the issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to preface this comment, by admitting that the code path I've been testing with out main app is still using a 0.9.1 client code branch with extra cherry-picks on top (we haven't been able to move all workflows to use 0.10 just yet), so it's not really the same test with the HEAD code you did (I'll try to do this with HEAD code tomorrow), however I don't see many changes on the libbacktrace code that would have changed this behavior.
I only see it hit fileline_initialize then backtrace_initialize once on the first resolve, because it will then set state->fileline_fn at the end of fileline_initialize() and that will prevent it from calling backtrace_initialize on any subsequent calls.
Maybe you recall if something was changed between 0.9.1 and HEAD that would have allowed backtrace_initialize be called multiple times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as I understand it, the problem you have is that shared objects that are dlopened at a later time are not visible because libbacktrace has already collected a list of shared objects that the executable is linking to and will not look for more
exactly.
This feels like something libbacktrace should handle itself (in an ideal world).
I guess before I started looking more closely at the code I didn't realize you had already modified that libbacktrace code quite a bit and it wasn't just a dump from upstream.
you already know when you want to reset libbacktrace state. Have you tried running dl_iterate_phdr to call phdr_callback instead?
Yes, libacktrace code is already doing pretty much what I was doing once, so I can try to expose an update method I can call from outside - I'll try to do this this week and update this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess before I started looking more closely at the code I didn't realize you had already modified that libbacktrace code quite a bit and it wasn't just a dump from upstream.
There have been some changes that add some sorely missing features, but there have been no substantial changes. The changes to dl_iterate_phdr
are technical: 7e8961d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, libacktrace code is already doing pretty much what I was doing once, so I can try to expose an update method I can call from outside - I'll try to do this this week and update this PR.
Please check if it would be feasible to implement this update functionality within libbacktrace itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to dl_iterate_phdr are technical: 7e8961d
I think this was needed because elf_add
can now call debuginfod, which might issue network requests, and this didn't play well with the requirements of dl_iterate_phdr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check if it would be feasible to implement this update functionality within libbacktrace itself.
I made an attempt at it in the latest commit
…= 3 to obtain image path and addreses instead of dladdr()
… loaded after backtrace_initialize() has been called, and consider them for symbol resolution
public/libbacktrace/dwarf.cpp
Outdated
return ret; | ||
} | ||
|
||
// if we failed to obtain an entry in range, it can mean that the address map has been cahnges and new entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole file is marked as changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry line endings shenanigans - should be fixed now.
The indentation in those libbacktrace files is also somewhat "special"
} | ||
} | ||
image->m_name = cache->m_imageName; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is image->m_name
valid after this function returns? My guess would be that the info
argument is only valid inside this function, and the dlInfo
provided filename will be invalidated after the next call to dladdr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point - fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this memory released?
public/client/TracyCallstack.cpp
Outdated
[]( const ImageEntry& lhs, const ImageEntry& rhs ) { return lhs.m_startAddress > rhs.m_startAddress; } ); | ||
} | ||
|
||
const ImageEntry* GetImageEntryForAddress( void* address ) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to GetImageForAddressImpl
.
const ImageEntry* entry = GetImageEntryForAddress( address ); | ||
if( !entry ) | ||
{ | ||
Refresh(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a mechanism that will prevent constant reloads when a large set of unmapped addresses is queried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't one atm. The expectation would be that it wouldn't actually be called that often in real scenarios. In my case with a few hundred dynamic loaded shared libs, and instrumenting all memory allocations with a callstack captures I only get it called ~5 times until it actually has the full map as symbol resolution is relatively slow in itself.
What would be your suggestion, a "min ms between refreshes" and either skip (and return unknown symbols for that period) or "stall" the resolution thread for a period or time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stalling to allow batch processing of a larger set may be a solution. But I was more interested in whether you have considered this case. I don't think this is immediately actionable.
public/libbacktrace/elf.cpp
Outdated
@@ -7366,6 +7393,12 @@ phdr_callback_mock (struct dl_phdr_info *info, size_t size ATTRIBUTE_UNUSED, | |||
} | |||
else ptr->dlpi_name = nullptr; | |||
ptr->dlpi_addr = info->dlpi_addr; | |||
|
|||
// calculate the address range so we can quickly determine is a PC is within the range of this image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo.
public/client/TracyCallstack.cpp
Outdated
// so we must get it in an alternative way and cache it | ||
if( info->dlpi_name && info->dlpi_name[0] != '\0' ) | ||
{ | ||
image->m_name = info->dlpi_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this assignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point - copied it there as well.
I now free and rebuild images names with every refresh for simplicity - would probably be better if I just recreated the new image entries, although it would require searching and comparing the paths to be sure so not sure it's worth it.
Okay, I get this without dynload too, so I'll make an issue instead. |
Made an issue at #684; I somewhat suspect the latter changes in this PR though, because when I tested it in the beginning it seemed to work fine. |
On glibc platforms, add an image cache to avoid calling dladdr() when doing offline symbol resolution, as dladdr() can cause contention with other dl* calls (see: #665 (comment)).
Add support for libbacktrace to detect new elfs have been dynamically loaded after backtrace_initialize() has been called, and consider them for further symbol resolution.