-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clangd] Fix dangling references to StringMap's value. #169339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
StringMap insertion may cause elements to be rehased, or the underlying storage to be reallocated, so it is generatelly unsafe to take pointers to StringMap "values". This in turn caused clangd to fail to load its on disk cache as the code depended on pointer equality (and immutability) to work. After this change, clangd will successfully re-open its index for large projects (e.g., LLVM itself), thus improving the developer experience.
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-clang-tools-extra @llvm/pr-subscribers-clangd Author: John Porto (jpporto) ChangesStringMap insertion may cause elements to be rehased, or the underlying storage to be reallocated, so it is generatelly unsafe to take pointers to StringMap "values". This in turn caused clangd to fail to load its on disk cache as the code depended on pointer equality (and immutability) to work. After this change, clangd will successfully re-open its index for large projects (e.g., LLVM itself), thus improving the developer experience. Full diff: https://github.com/llvm/llvm-project/pull/169339.diff 2 Files Affected:
diff --git a/clang-tools-extra/clangd/GlobalCompilationDatabase.cpp b/clang-tools-extra/clangd/GlobalCompilationDatabase.cpp
index c6afd0bc07cbd..43e27b3f21d1a 100644
--- a/clang-tools-extra/clangd/GlobalCompilationDatabase.cpp
+++ b/clang-tools-extra/clangd/GlobalCompilationDatabase.cpp
@@ -398,8 +398,12 @@ DirectoryBasedGlobalCompilationDatabase::getDirectoryCaches(
Ret.reserve(Dirs.size());
std::lock_guard<std::mutex> Lock(DirCachesMutex);
- for (unsigned I = 0; I < Dirs.size(); ++I)
- Ret.push_back(&DirCaches.try_emplace(FoldedDirs[I], Dirs[I]).first->second);
+ for (unsigned I = 0; I < Dirs.size(); ++I) {
+ std::unique_ptr<DirectoryCache> &DC = DirCaches[FoldedDirs[I]];
+ if (!DC)
+ DC = std::make_unique<DirectoryCache>(Dirs[I]);
+ Ret.push_back(DC.get());
+ }
return Ret;
}
@@ -571,7 +575,7 @@ class DirectoryBasedGlobalCompilationDatabase::BroadcastThread::Filter {
enum { Unknown, Missing, TargetCDB, OtherCDB } State = Unknown;
DirInfo *Parent = nullptr;
};
- llvm::StringMap<DirInfo> Dirs;
+ llvm::StringMap<std::unique_ptr<DirInfo>> Dirs;
// A search path starts at a directory, and either includes ancestors or not.
using SearchPath = llvm::PointerIntPair<DirInfo *, 1>;
@@ -583,16 +587,18 @@ class DirectoryBasedGlobalCompilationDatabase::BroadcastThread::Filter {
DirInfo *Child = nullptr;
actOnAllParentDirectories(FilePath, [&](llvm::StringRef Dir) {
auto &Info = Dirs[Dir];
+ if (!Info)
+ Info = std::make_unique<DirInfo>();
// If this is the first iteration, then this node is the overall result.
if (!Leaf)
- Leaf = &Info;
+ Leaf = Info.get();
// Fill in the parent link from the previous iteration to this parent.
if (Child)
- Child->Parent = &Info;
+ Child->Parent = Info.get();
// Keep walking, whether we inserted or not, if parent link is missing.
// (If it's present, parent links must be present up to the root, so stop)
- Child = &Info;
- return Info.Parent != nullptr;
+ Child = Info.get();
+ return Info->Parent != nullptr;
});
return Leaf;
}
@@ -609,7 +615,7 @@ class DirectoryBasedGlobalCompilationDatabase::BroadcastThread::Filter {
DirValues.reserve(Dirs.size());
for (auto &E : Dirs) {
DirKeys.push_back(E.first());
- DirValues.push_back(&E.second);
+ DirValues.push_back(E.second.get());
}
// Also look up the cache entry for the CDB we're broadcasting.
@@ -677,7 +683,10 @@ class DirectoryBasedGlobalCompilationDatabase::BroadcastThread::Filter {
std::vector<SearchPath> SearchPaths(AllFiles.size());
for (unsigned I = 0; I < AllFiles.size(); ++I) {
if (Parent.Opts.CompileCommandsDir) { // FIXME: unify with config
- SearchPaths[I].setPointer(&Dirs[*Parent.Opts.CompileCommandsDir]);
+ std::unique_ptr<DirInfo> &Dir = Dirs[*Parent.Opts.CompileCommandsDir];
+ if (!Dir)
+ Dir = std::make_unique<DirInfo>();
+ SearchPaths[I].setPointer(Dir.get());
continue;
}
if (ExitEarly()) // loading config may be slow
@@ -693,7 +702,10 @@ class DirectoryBasedGlobalCompilationDatabase::BroadcastThread::Filter {
SearchPaths[I].setPointer(addParents(AllFiles[I]));
break;
case Config::CDBSearchSpec::FixedDir:
- SearchPaths[I].setPointer(&Dirs[*Spec.FixedCDBPath]);
+ std::unique_ptr<DirInfo> &Dir = Dirs[*Spec.FixedCDBPath];
+ if (!Dir)
+ Dir = std::make_unique<DirInfo>();
+ SearchPaths[I].setPointer(Dir.get());
break;
}
}
diff --git a/clang-tools-extra/clangd/GlobalCompilationDatabase.h b/clang-tools-extra/clangd/GlobalCompilationDatabase.h
index 1d636d73664be..2a8e3821c4596 100644
--- a/clang-tools-extra/clangd/GlobalCompilationDatabase.h
+++ b/clang-tools-extra/clangd/GlobalCompilationDatabase.h
@@ -143,7 +143,7 @@ class DirectoryBasedGlobalCompilationDatabase
class DirectoryCache;
// Keyed by possibly-case-folded directory path.
// We can hand out pointers as they're stable and entries are never removed.
- mutable llvm::StringMap<DirectoryCache> DirCaches;
+ mutable llvm::StringMap<std::unique_ptr<DirectoryCache>> DirCaches;
mutable std::mutex DirCachesMutex;
std::vector<DirectoryCache *>
|
|
Nice! Can you please add a testcase? |
|
|
I am unsure how to add a test for this -- the error is non-deterministic. Moreover, the original commit didn't add a test, so it is hard to base off of that. I am open to suggestions, though. |
|
@bcardosolopes pointed me to clang-tools-extra/clangd/unittests/GlobalCompilationDatabaseTests.cpp, so I spent some time looking at that code. I still don't know how I can write a test. The scenario I want to repro is what happens during clangd's initialization when it lists all files in the compile_commands.json, and decides whether or not it should rebuild the index. In that scenario the |
Are you sure about this? The comment above the declaration of And while I haven't studied the implementation of |
|
In fact, I can give |
|
Thanks for the pointers, truly appreciated. I am not sure what was happening on my machine (my clangd was built with a revision from a few weeks ago, so probably a broken version either). I am closing the PR for now until I can repro the issue on my side (hopefully never). Cheers. |
StringMap insertion may cause elements to be rehased, or the underlying storage to be reallocated, so it is generatelly unsafe to take pointers to StringMap "values". This in turn caused clangd to fail to load its on disk cache as the code depended on pointer equality (and immutability) to work.
After this change, clangd will successfully re-open its index for large projects (e.g., LLVM itself), thus improving the developer experience.