-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[lldb] fix parallel module loading deadlock for Linux DYLD #166480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
I wonder if instead of a boolean we should pass down a function that schedules the module loading as suggested in swiftlang#10746. That gives more flexibility in that we could immediately schedule the preload to happen in parallel (as done in the other PR), or for your use case we could accumulate the preload callbacks into a vector and then schedule all the preload jobs to run in parallel once all the modules have been loaded. Any thoughts @DmT021? |
(Proper link would be swiftlang#10746. #10746 in this repo refers to something unrelated) |
|
@llvm/pr-subscribers-lldb Author: Tom Yang (zhyty) ChangesAnother attempt at resolving the deadlock issue @GeorgeHuyubo discovered (his previous attempt). This change can be summarized as the following:
These changes roughly maintain the performance characteristics of the previous implementation of parallel module loading. Testing on targets with between 5000 and 14000 modules, I saw similar numbers as before, often more than 10% faster in the new implementation across multiple trials for these massive targets. I think it's because we have less lock contention with this approach. The deadlockAs @GeorgeHuyubo explains in his PR, the deadlock occurs from an ABBA deadlock that happens when a thread context-switches out of
A competing thread may execute Reproducing the deadlockIt might be worth noting that I've never been able to observe this deadlock issue during live debugging (e.g. launching or attaching to processes), however we were able to consistently reproduce this issue with coredumps when using the following settings: How this change avoids this deadlockThis change avoids concurrent executions of Why not read-write lock?Some feedback in #160225 was to modify mutexes used in these components with read-write locks. This might be a good idea overall, but I don't think it would easily resolve this specific deadlock. Perhaps with deeper architectural changes, we could fix this issue? Other attemptsOne downside of this approach (and the former approach of parallel module loading) is that each DYLD would need to implement this pattern themselves. With @clayborg's help, I looked at a few other approaches:
Test Plan:Full diff: https://github.com/llvm/llvm-project/pull/166480.diff 10 Files Affected:
diff --git a/lldb/include/lldb/Core/ModuleList.h b/lldb/include/lldb/Core/ModuleList.h
index e71f3b2bad6b4..8e87a2cb01df6 100644
--- a/lldb/include/lldb/Core/ModuleList.h
+++ b/lldb/include/lldb/Core/ModuleList.h
@@ -511,6 +511,9 @@ class ModuleList {
/// Atomically swaps the contents of this module list with \a other.
void Swap(ModuleList &other);
+ /// For each module in this ModuleList, preload its symbols.
+ void PreloadSymbols() const;
+
protected:
// Class typedefs.
typedef std::vector<lldb::ModuleSP>
diff --git a/lldb/include/lldb/Target/DynamicLoader.h b/lldb/include/lldb/Target/DynamicLoader.h
index 75bb6cb6bb907..271d5f761696c 100644
--- a/lldb/include/lldb/Target/DynamicLoader.h
+++ b/lldb/include/lldb/Target/DynamicLoader.h
@@ -207,10 +207,17 @@ class DynamicLoader : public PluginInterface {
/// resulting module at the virtual base address \p base_addr.
/// Note that this calls Target::GetOrCreateModule with notify being false,
/// so it is necessary to call Target::ModulesDidLoad afterwards.
+ ///
+ /// \param[in] defer_module_preload
+ /// If the module needs to be loaded, prevent the module from being
+ /// preloaded even if the user sets the preload symbols option. This is
+ /// used as a performance optimization should the caller preload the
+ /// modules in parallel after calling this function.
virtual lldb::ModuleSP LoadModuleAtAddress(const lldb_private::FileSpec &file,
lldb::addr_t link_map_addr,
lldb::addr_t base_addr,
- bool base_addr_is_offset);
+ bool base_addr_is_offset,
+ bool defer_module_preload = false);
/// Find/load a binary into lldb given a UUID and the address where it is
/// loaded in memory, or a slide to be applied to the file address.
@@ -352,7 +359,13 @@ class DynamicLoader : public PluginInterface {
protected:
// Utility methods for derived classes
- lldb::ModuleSP FindModuleViaTarget(const FileSpec &file);
+ /// Find a module in the target that matches the given file.
+ ///
+ /// \param[in] defer_module_preload
+ /// If the module needs to be loaded, prevent the module from being
+ /// preloaded even if the user sets the preload symbols option.
+ lldb::ModuleSP FindModuleViaTarget(const FileSpec &file,
+ bool defer_module_preload = false);
/// Checks to see if the target module has changed, updates the target
/// accordingly and returns the target executable module.
diff --git a/lldb/include/lldb/Target/Target.h b/lldb/include/lldb/Target/Target.h
index c375df248154f..af755508cbe74 100644
--- a/lldb/include/lldb/Target/Target.h
+++ b/lldb/include/lldb/Target/Target.h
@@ -649,12 +649,20 @@ class Target : public std::enable_shared_from_this<Target>,
/// will handle / summarize the failures in a custom way and
/// don't use these messages.
///
+ /// \param[in] defer_preload
+ /// If true, the module will not be preloaded even if
+ /// Target::GetPreloadSymbols() is true. This is useful when the caller
+ /// wishes to preload loaded modules in parallel after calling this
+ /// function for better performance. This is because it's currently not
+ /// thread-safe to do so during the execution of this function.
+ ///
/// \return
/// An empty ModuleSP will be returned if no matching file
/// was found. If error_ptr was non-nullptr, an error message
/// will likely be provided.
lldb::ModuleSP GetOrCreateModule(const ModuleSpec &module_spec, bool notify,
- Status *error_ptr = nullptr);
+ Status *error_ptr = nullptr,
+ bool defer_preload = false);
// Settings accessors
diff --git a/lldb/source/Core/DynamicLoader.cpp b/lldb/source/Core/DynamicLoader.cpp
index 7580b15c02ce1..d1f2ae33af75e 100644
--- a/lldb/source/Core/DynamicLoader.cpp
+++ b/lldb/source/Core/DynamicLoader.cpp
@@ -154,7 +154,8 @@ DynamicLoader::GetSectionListFromModule(const ModuleSP module) const {
return sections;
}
-ModuleSP DynamicLoader::FindModuleViaTarget(const FileSpec &file) {
+ModuleSP DynamicLoader::FindModuleViaTarget(const FileSpec &file,
+ bool defer_module_preload) {
Target &target = m_process->GetTarget();
ModuleSpec module_spec(file, target.GetArchitecture());
if (UUID uuid = m_process->FindModuleUUID(file.GetPath())) {
@@ -165,7 +166,9 @@ ModuleSP DynamicLoader::FindModuleViaTarget(const FileSpec &file) {
if (ModuleSP module_sp = target.GetImages().FindFirstModule(module_spec))
return module_sp;
- if (ModuleSP module_sp = target.GetOrCreateModule(module_spec, false))
+ if (ModuleSP module_sp = target.GetOrCreateModule(
+ module_spec, false /* notify */, nullptr /* error_ptr */,
+ defer_module_preload))
return module_sp;
return nullptr;
@@ -174,8 +177,9 @@ ModuleSP DynamicLoader::FindModuleViaTarget(const FileSpec &file) {
ModuleSP DynamicLoader::LoadModuleAtAddress(const FileSpec &file,
addr_t link_map_addr,
addr_t base_addr,
- bool base_addr_is_offset) {
- if (ModuleSP module_sp = FindModuleViaTarget(file)) {
+ bool base_addr_is_offset,
+ bool defer_module_preload) {
+ if (ModuleSP module_sp = FindModuleViaTarget(file, defer_module_preload)) {
UpdateLoadedSections(module_sp, link_map_addr, base_addr,
base_addr_is_offset);
return module_sp;
diff --git a/lldb/source/Core/ModuleList.cpp b/lldb/source/Core/ModuleList.cpp
index c40612c1ced5e..c8594b4aa803a 100644
--- a/lldb/source/Core/ModuleList.cpp
+++ b/lldb/source/Core/ModuleList.cpp
@@ -7,6 +7,7 @@
//===----------------------------------------------------------------------===//
#include "lldb/Core/ModuleList.h"
+#include "lldb/Core/Debugger.h"
#include "lldb/Core/Module.h"
#include "lldb/Core/ModuleSpec.h"
#include "lldb/Core/PluginManager.h"
@@ -26,6 +27,7 @@
#include "lldb/Utility/Log.h"
#include "lldb/Utility/UUID.h"
#include "lldb/lldb-defines.h"
+#include "llvm/Support/ThreadPool.h"
#if defined(_WIN32)
#include "lldb/Host/windows/PosixApi.h"
@@ -1357,3 +1359,14 @@ void ModuleList::Swap(ModuleList &other) {
m_modules_mutex, other.m_modules_mutex);
m_modules.swap(other.m_modules);
}
+
+void ModuleList::PreloadSymbols() const {
+ std::lock_guard<std::recursive_mutex> guard(m_modules_mutex);
+ llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
+ for (const ModuleSP &module_sp : m_modules)
+ task_group.async([module_sp] {
+ if (module_sp)
+ module_sp->PreloadSymbols();
+ });
+ // task group destructor waits for all tasks to complete
+}
diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
index 326b6910b5267..2ca3f1eb6e27a 100644
--- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
+++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
@@ -453,7 +453,8 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
// exists for the duration of this call in `m_rendezvous`.
auto load_module_fn =
[this, &loaded_modules, &new_modules,
- &interpreter_module_mutex](const DYLDRendezvous::SOEntry &so_entry) {
+ &interpreter_module_mutex](const DYLDRendezvous::SOEntry &so_entry,
+ bool defer_module_preload) {
// Don't load a duplicate copy of ld.so if we have already loaded it
// earlier in LoadInterpreterModule. If we instead loaded then
// unloaded it later, the section information for ld.so would be
@@ -469,7 +470,8 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
}
ModuleSP module_sp = LoadModuleAtAddress(
- so_entry.file_spec, so_entry.link_addr, so_entry.base_addr, true);
+ so_entry.file_spec, so_entry.link_addr, so_entry.base_addr,
+ true /* base_addr_is_offset */, defer_module_preload);
if (!module_sp.get())
return;
@@ -503,11 +505,14 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() {
if (m_process->GetTarget().GetParallelModuleLoad()) {
llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
for (; I != E; ++I)
- task_group.async(load_module_fn, *I);
+ task_group.async(load_module_fn, *I, true /* defer_module_preload */);
task_group.wait();
+
+ if (m_process->GetTarget().GetPreloadSymbols())
+ new_modules.PreloadSymbols();
} else {
for (; I != E; ++I)
- load_module_fn(*I);
+ load_module_fn(*I, false /* defer_module_preload */);
}
m_process->GetTarget().ModulesDidLoad(new_modules);
@@ -644,12 +649,12 @@ ModuleSP DynamicLoaderPOSIXDYLD::LoadInterpreterModule() {
return nullptr;
}
-ModuleSP DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(const FileSpec &file,
- addr_t link_map_addr,
- addr_t base_addr,
- bool base_addr_is_offset) {
+ModuleSP DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(
+ const FileSpec &file, addr_t link_map_addr, addr_t base_addr,
+ bool base_addr_is_offset, bool defer_module_preload) {
if (ModuleSP module_sp = DynamicLoader::LoadModuleAtAddress(
- file, link_map_addr, base_addr, base_addr_is_offset))
+ file, link_map_addr, base_addr, base_addr_is_offset,
+ defer_module_preload))
return module_sp;
// This works around an dynamic linker "bug" on android <= 23, where the
@@ -668,7 +673,7 @@ ModuleSP DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(const FileSpec &file,
!(memory_info.GetName().IsEmpty())) {
if (ModuleSP module_sp = DynamicLoader::LoadModuleAtAddress(
FileSpec(memory_info.GetName().GetStringRef()), link_map_addr,
- base_addr, base_addr_is_offset))
+ base_addr, base_addr_is_offset, defer_module_preload))
return module_sp;
}
}
@@ -704,9 +709,11 @@ void DynamicLoaderPOSIXDYLD::LoadAllCurrentModules() {
module_names, m_process->GetTarget().GetArchitecture().GetTriple());
auto load_module_fn = [this, &module_list,
- &log](const DYLDRendezvous::SOEntry &so_entry) {
- ModuleSP module_sp = LoadModuleAtAddress(
- so_entry.file_spec, so_entry.link_addr, so_entry.base_addr, true);
+ &log](const DYLDRendezvous::SOEntry &so_entry,
+ bool defer_module_preload) {
+ ModuleSP module_sp =
+ LoadModuleAtAddress(so_entry.file_spec, so_entry.link_addr,
+ so_entry.base_addr, true, defer_module_preload);
if (module_sp.get()) {
LLDB_LOG(log, "LoadAllCurrentModules loading module: {0}",
so_entry.file_spec.GetFilename());
@@ -723,11 +730,14 @@ void DynamicLoaderPOSIXDYLD::LoadAllCurrentModules() {
if (m_process->GetTarget().GetParallelModuleLoad()) {
llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I)
- task_group.async(load_module_fn, *I);
+ task_group.async(load_module_fn, *I, true /* defer_module_preload */);
task_group.wait();
+
+ if (m_process->GetTarget().GetPreloadSymbols())
+ module_list.PreloadSymbols();
} else {
for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I) {
- load_module_fn(*I);
+ load_module_fn(*I, false /* defer_module_preload */);
}
}
diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h
index 6efb92673a13c..2d9dba9bca9fa 100644
--- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h
+++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h
@@ -55,10 +55,11 @@ class DynamicLoaderPOSIXDYLD : public lldb_private::DynamicLoader {
// PluginInterface protocol
llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }
- lldb::ModuleSP LoadModuleAtAddress(const lldb_private::FileSpec &file,
- lldb::addr_t link_map_addr,
- lldb::addr_t base_addr,
- bool base_addr_is_offset) override;
+ lldb::ModuleSP
+ LoadModuleAtAddress(const lldb_private::FileSpec &file,
+ lldb::addr_t link_map_addr, lldb::addr_t base_addr,
+ bool base_addr_is_offset,
+ bool defer_module_preload = false) override;
void CalculateDynamicSaveCoreRanges(
lldb_private::Process &process,
diff --git a/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.cpp b/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.cpp
index d019415cb67a6..09cde464ccdc8 100644
--- a/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.cpp
+++ b/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.cpp
@@ -67,9 +67,11 @@ ThreadPlanSP DynamicLoaderWasmDYLD::GetStepThroughTrampolinePlan(Thread &thread,
lldb::ModuleSP DynamicLoaderWasmDYLD::LoadModuleAtAddress(
const lldb_private::FileSpec &file, lldb::addr_t link_map_addr,
- lldb::addr_t base_addr, bool base_addr_is_offset) {
+ lldb::addr_t base_addr, bool base_addr_is_offset,
+ bool defer_module_preload) {
if (ModuleSP module_sp = DynamicLoader::LoadModuleAtAddress(
- file, link_map_addr, base_addr, base_addr_is_offset))
+ file, link_map_addr, base_addr, base_addr_is_offset,
+ defer_module_preload))
return module_sp;
if (ModuleSP module_sp = m_process->ReadModuleFromMemory(file, base_addr)) {
diff --git a/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.h b/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.h
index 5ed855395cca7..9c13bdff0279a 100644
--- a/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.h
+++ b/lldb/source/Plugins/DynamicLoader/wasm-DYLD/DynamicLoaderWasmDYLD.h
@@ -33,10 +33,11 @@ class DynamicLoaderWasmDYLD : public DynamicLoader {
Status CanLoadImage() override { return Status(); }
lldb::ThreadPlanSP GetStepThroughTrampolinePlan(Thread &thread,
bool stop) override;
- lldb::ModuleSP LoadModuleAtAddress(const lldb_private::FileSpec &file,
- lldb::addr_t link_map_addr,
- lldb::addr_t base_addr,
- bool base_addr_is_offset) override;
+ lldb::ModuleSP
+ LoadModuleAtAddress(const lldb_private::FileSpec &file,
+ lldb::addr_t link_map_addr, lldb::addr_t base_addr,
+ bool base_addr_is_offset,
+ bool defer_module_preload = false) override;
/// \}
diff --git a/lldb/source/Target/Target.cpp b/lldb/source/Target/Target.cpp
index d070c3d953d4a..ea685992043e7 100644
--- a/lldb/source/Target/Target.cpp
+++ b/lldb/source/Target/Target.cpp
@@ -2342,7 +2342,8 @@ bool Target::ReadPointerFromMemory(const Address &addr, Status &error,
}
ModuleSP Target::GetOrCreateModule(const ModuleSpec &orig_module_spec,
- bool notify, Status *error_ptr) {
+ bool notify, Status *error_ptr,
+ bool defer_preload) {
ModuleSP module_sp;
Status error;
@@ -2406,7 +2407,7 @@ ModuleSP Target::GetOrCreateModule(const ModuleSpec &orig_module_spec,
module_spec.GetFileSpec().GetDirectory(), transformed_dir)) {
transformed_spec.GetFileSpec().SetDirectory(transformed_dir);
transformed_spec.GetFileSpec().SetFilename(
- module_spec.GetFileSpec().GetFilename());
+ module_spec.GetFileSpec().GetFilename());
error = ModuleList::GetSharedModule(transformed_spec, module_sp,
&search_paths, &old_modules,
&did_create_module);
@@ -2510,8 +2511,9 @@ ModuleSP Target::GetOrCreateModule(const ModuleSpec &orig_module_spec,
// Preload symbols outside of any lock, so hopefully we can do this for
// each library in parallel.
- if (GetPreloadSymbols())
+ if (GetPreloadSymbols() && !defer_preload)
module_sp->PreloadSymbols();
+
llvm::SmallVector<ModuleSP, 1> replaced_modules;
for (ModuleSP &old_module_sp : old_modules) {
if (m_images.GetIndexForModule(old_module_sp.get()) !=
|
I just finished writing a small blurb explaining why I went with this approach, so maybe that helps justify this approach over that one?
Immediately scheduling the preload to happen in parallel shouldn't be thread-safe due to the ABBA deadlock. I don't think that flexibility is good, then, if it leads callers into this deadlock. |
|
Since you can easily reproduce the deadlock, can you include a backtrace to help me better understand the call stack. I'm particularly interested in understanding why At first glance, the current solution feels a bit ad-hoc, especially because Lastly, adopting a read-write lock in the module list would benefit more than just resolving this deadlock, so I would really like us to explore that option first. The problem here is again that we are calling |
Just included @GeorgeHuyubo's backtrace in the description bt.txt. It's not that
So,
Yeah, I think that would be fine as well. I am probably missing a nice clean way of writing this solution out :/. Any ideas how we would defer this work unconditionally? Should I refactor I think we would actually need to change at least the Darwin DYLD as well since this could theoretically take place for it too (though I haven't reproduced on Mac yet).
That's true, and I think would benefit the overall design of the module list! I think the part that I don't agree on is that it would do anything to resolve this deadlock -- looking at the execution, would we not run into the same deadlock due to lock contention between readers and writers? |
|
Thanks for the backtrace and the clarification. I now also finally understand why you mentioned "context switching", which confused me as that's not something the LLVM ThreadPool supports. What's going on is that one of the worker threads is submitting new tasks (belonging to a different task group) and waiting on them. From https://llvm.org/doxygen/classllvm_1_1ThreadPoolInterface.html:
However that last statement doesn't account for the situation where we're in. IIUC, we have a task group X and a task group Y. We create task group Y from a task belonging to X, and when we call wait, the thread pool picks up the next task from X, not from Y. This is what you referred to as "context switching". It also explains why @GeorgeHuyubo's switch to a separate ThreadPool avoided the issue. With that information, I would argue that the actual mutexes involved here are irrelevant. Nothing can make this operation inherently safe. Even without any mutexes, with a sufficiently small group of threads, this may never complete. The only solution here is to not create a new task pool from a task. Your patch is doing exactly that by deferring the preload until after we're processing the modules in parallel. Alright, I've convinced myself that this is the right approach. Thanks for bearing with me. I'm now also convinced that we should do this unconditionally. The only reason we're not seeing this on Darwin is because we always use the accelerator tables and don't need to manually index. However, it inherently suffers from the same problem. Let's change this globally and we should be good everywhere. |
|
@JDevlieghere no worries, "context switching" was a bad (and incorrect) way of calling it. Your explanation helps me understand it better too!
Ah right, I always forget Darwin has that luxury!
By change it "globally", I was initially going to change it on the caller side. I.e. do something similar as |
I meant removing the flag, and always doing the "deferred" preloading after we've loaded all the modules. |
Right, sorry just to double check: there's a bunch of direct calls to |
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D86004440
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Target::GetOrCreateModule Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
cb250f6 to
fff56d6
Compare
|
Rebased, applied feedback from @JDevlieghere |
JDevlieghere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this LGTM modulo nits.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
Outdated
Show resolved
Hide resolved
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
Outdated
Show resolved
Hide resolved
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
|
Just a heads up, made a change since you've approved the changes. The change moves the preloading logic completely into The one downside is that calling |
clayborg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @JDevlieghere let us know if you like the Target::ModulesDidLoad doing the preloading and obeying the target settings.
JDevlieghere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the ping. Doing this in ModulesDidLoad makes sense. LGTM.
Another attempt at resolving the deadlock issue @GeorgeHuyubo discovered (his previous attempt).
This change can be summarized as the following:
GetOrCreateModulesall the way through toLoadModuleAtAddress.Module::PreloadSymbolsseparately fromTarget::GetOrCreateModuleand its callerLoadModuleAtAddress(this is what avoids the deadlock).These changes roughly maintain the performance characteristics of the previous implementation of parallel module loading. Testing on targets with between 5000 and 14000 modules, I saw similar numbers as before, often more than 10% faster in the new implementation across multiple trials for these massive targets. I think it's because we have less lock contention with this approach.
The deadlock
See bt.txt for a sample backtrace of LLDB when the deadlock occurs.
As @GeorgeHuyubo explains in his PR, the deadlock occurs from an ABBA deadlock that happens when a thread context-switches out of
Module::PreloadSymbols, goes intoTarget::GetOrCreateModulefor another module, possibly entering this block:Module::PreloadSymbolsholds a module-level mutex, and thenGetSharedModuleattempts to hold the mutex of the global sharedModuleList. So, this thread holds the module mutex, and waits on the global sharedModuleListmutex.A competing thread may execute
Target::GetOrCreateModule, enter the same block as above, grabbing the global sharedModuleListmutex. Then, inModuleList::GetSharedModule, we eventually callModuleList::FindModuleswhich eventually waits for theModulemutex held by the first thread (viaModule::GetUUID). Thus, we deadlock.Reproducing the deadlock
It might be worth noting that I've never been able to observe this deadlock issue during live debugging (e.g. launching or attaching to processes), however we were able to consistently reproduce this issue with coredumps when using the following settings:
How this change avoids this deadlock
This change avoids concurrent executions of
Module::PreloadSymbolswithTarget::GetOrCreateModuleby waiting until after theTarget::GetOrCreateModuleexecutions to runModule::PreloadSymbolsin parallel. This avoids the ordering of holding a Module lock then the ModuleList lock, asTarget::GetOrCreateModuleexecutions maintain the ordering of the shared ModuleList lock first (from what I've read and tested).Why not read-write lock?
Some feedback in #160225 was to modify mutexes used in these components with read-write locks. This might be a good idea overall, but I don't think it would easily resolve this specific deadlock.
Module::PreloadSymbolswould probably need a write lock to Module, so even if we had a read lock inModule::GetUUIDwe would still contend. Maybe theModuleListlock could be a read lock that converts to a write lock if it chooses to update the module, but it seems likely that some thread would try to update the shared module list and then the write lock would contend again.Perhaps with deeper architectural changes, we could fix this issue?
Other attempts
One downside of this approach (and the former approach of parallel module loading) is that each DYLD would need to implement this pattern themselves. With @clayborg's help, I looked at a few other approaches:
Target::GetOrCreateModule, backgrounding theModule::PreloadSymbolscall by adding it directly to the thread pool viaDebugger::GetThreadPool().async(). This required adding a lock toModule::SetLoadAddress(probably should be one there already) sinceObjectFileELF::SetLoadAddressis not thread-safe (updates sections). Unfortunately, during execution, this causes the preload symbols to run synchronously withTarget::GetOrCreateModule, preventing us from truly parallelizing the execution.Module::PreloadSymbols, backgrounding thesymtabandsym_filePreloadSymbolscalls individually, but similar issues as the above.Target::SetExecutableModuleswiftlang/llvm-project#10746 instead of the boolean I use in this change. It's functionally the same change IMO, with some design tradeoffs:Module::PreloadSymbolsitself, and can instead call whatever function is passed into the callback.GetOrCreateModulelogic, otherwise we run into the same issue. I thought this would be trickier for the caller, requiring some kinda condition variable or otherwise storing the calls to execute afterwards.Test Plan: