-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][DependencyScanning] Track modules that resolve from "shared" locations #130634
Conversation
That patch tracks whether all the file & module dependencies of a module resolve to a sysroot location. This information will later be queried by build systems for determining where to store the accompanying pcms.
@llvm/pr-subscribers-clang Author: Cyndy Ishida (cyndyishida) ChangesThat patch tracks whether all the file & module dependencies of a module resolve to a sysroot location. This information will later be queried by build systems for determining where to store the accompanying pcms. Full diff: https://github.com/llvm/llvm-project/pull/130634.diff 4 Files Affected:
diff --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 20fb4de6a2a73..6187f0168e6d9 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -114,6 +114,10 @@ struct ModuleDeps {
/// Whether this is a "system" module.
bool IsSystem;
+ /// Whether this is a module where it's dependencies resolve within the
+ /// sysroot.
+ bool IsInSysroot;
+
/// The path to the modulemap file which defines this module.
///
/// This can be used to explicitly build this module. This file will
@@ -219,6 +223,9 @@ class ModuleDepCollectorPP final : public PPCallbacks {
llvm::DenseSet<const Module *> &AddedModules);
void addAffectingClangModule(const Module *M, ModuleDeps &MD,
llvm::DenseSet<const Module *> &AddedModules);
+
+ /// Add discovered module dependency for the given module.
+ void addClangModule(const Module *M, const ModuleID ID, ModuleDeps &MD);
};
/// Collects modular and non-modular dependencies of the main file by attaching
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 36b75c1016cd8..86eda34472cf0 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -698,6 +698,15 @@ ModuleDepCollectorPP::handleTopLevelModule(const Module *M) {
MD.ID.ModuleName = M->getFullModuleName();
MD.IsSystem = M->IsSystem;
+
+ // Start off with the assumption that this module is in the sysroot when there
+ // is a sysroot provided. As more dependencies are discovered, check if those
+ // come from the provided sysroot.
+ const StringRef CurrSysroot = MDC.ScanInstance.getHeaderSearchOpts().Sysroot;
+ MD.IsInSysroot =
+ !CurrSysroot.empty() &&
+ (llvm::sys::path::root_directory(CurrSysroot) != CurrSysroot);
+
// For modules which use export_as link name, the linked product that of the
// corresponding export_as-named module.
if (!M->UseExportAsModuleLinkName)
@@ -739,6 +748,11 @@ ModuleDepCollectorPP::handleTopLevelModule(const Module *M) {
MDC.ScanInstance.getASTReader()->visitInputFileInfos(
*MF, /*IncludeSystem=*/true,
[&](const serialization::InputFileInfo &IFI, bool IsSystem) {
+ auto FullFilePath = ASTReader::ResolveImportedPath(
+ PathBuf, IFI.UnresolvedImportedFilename, MF->BaseDirectory);
+ if (MD.IsInSysroot)
+ MD.IsInSysroot = FullFilePath->starts_with(CurrSysroot);
+ PathBuf.resize_for_overwrite(256);
if (!(IFI.TopLevel && IFI.ModuleMap))
return;
if (IFI.UnresolvedImportedFilenameAsRequested.ends_with(
@@ -835,6 +849,13 @@ void ModuleDepCollectorPP::addAllSubmoduleDeps(
});
}
+void ModuleDepCollectorPP::addClangModule(const Module *M, const ModuleID ID,
+ ModuleDeps &MD) {
+ MD.ClangModuleDeps.push_back(ID);
+ if (MD.IsInSysroot)
+ MD.IsInSysroot = MDC.ModularDeps[M]->IsInSysroot;
+}
+
void ModuleDepCollectorPP::addModuleDep(
const Module *M, ModuleDeps &MD,
llvm::DenseSet<const Module *> &AddedModules) {
@@ -843,7 +864,7 @@ void ModuleDepCollectorPP::addModuleDep(
!MDC.isPrebuiltModule(Import)) {
if (auto ImportID = handleTopLevelModule(Import->getTopLevelModule()))
if (AddedModules.insert(Import->getTopLevelModule()).second)
- MD.ClangModuleDeps.push_back(*ImportID);
+ addClangModule(Import->getTopLevelModule(), *ImportID, MD);
}
}
}
@@ -867,7 +888,7 @@ void ModuleDepCollectorPP::addAffectingClangModule(
!MDC.isPrebuiltModule(Affecting)) {
if (auto ImportID = handleTopLevelModule(Affecting))
if (AddedModules.insert(Affecting).second)
- MD.ClangModuleDeps.push_back(*ImportID);
+ addClangModule(Affecting, *ImportID, MD);
}
}
}
diff --git a/clang/test/ClangScanDeps/modules-in-sysroot.c b/clang/test/ClangScanDeps/modules-in-sysroot.c
new file mode 100644
index 0000000000000..d96aa69c0e8f4
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-in-sysroot.c
@@ -0,0 +1,107 @@
+// This test verifies modules that are entirely comprised from sysroot inputs are captured in
+// dependency information.
+
+// The first compilation verifies that transitive dependencies on non-sysroot input are captured.
+// The second compilation verifies that external paths are resolved when a vfsoverlay is applied when considering sysroot-ness.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/compile-commands.json.in > %t/compile-commands.json
+// RUN: sed -e "s|DIR|%/t|g" %t/overlay.json.template > %t/overlay.json
+// RUN: clang-scan-deps -compilation-database %t/compile-commands.json \
+// RUN: -j 1 -format experimental-full > %t/deps.db
+// RUN: cat %t/deps.db | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// CHECK: "modules": [
+// CHECK-NEXT: {
+// CHECK: "is-in-sysroot": true,
+// CHECK: "name": "A"
+
+// Verify that there are no more occurances of sysroot.
+// CHECK-NOT: "is-in-sysroot"
+
+// CHECK: "name": "A"
+// CHECK: "USE_VFS"
+// CHECK: "name": "B"
+// CHECK: "name": "C"
+// CHECK: "name": "D"
+// CHECK: "name": "NotInSDK"
+
+//--- compile-commands.json.in
+[
+{
+ "directory": "DIR",
+ "command": "clang -c DIR/client.m -isysroot DIR/MacOSX.sdk -I DIR/BuildDir -fmodules -fmodules-cache-path=DIR/module-cache -fimplicit-module-maps",
+ "file": "DIR/client.m"
+},
+{
+ "directory": "DIR",
+ "command": "clang -c DIR/client.m -isysroot DIR/MacOSX.sdk -ivfsoverlay DIR/overlay.json -DUSE_VFS -I DIR/BuildDir -fmodules -fmodules-cache-path=DIR/module-cache -fimplicit-module-maps",
+ "file": "DIR/client.m"
+}
+]
+
+//--- overlay.json.template
+{
+ "version": 0,
+ "case-sensitive": "false",
+ "roots": [
+ {
+ "external-contents": "DIR/local/A/A_vfs.h",
+ "name": "DIR/MacOSX.sdk/usr/include/A/A_vfs.h",
+ "type": "file"
+ }
+ ]
+}
+
+//--- MacOSX.sdk/usr/include/A/module.modulemap
+module A {
+ umbrella "."
+}
+
+//--- MacOSX.sdk/usr/include/A/A.h
+#ifdef USE_VFS
+#include <A/A_vfs.h>
+#endif
+typedef int A_t;
+
+//--- local/A/A_vfs.h
+typedef int typeFromVFS;
+
+//--- MacOSX.sdk/usr/include/B/module.modulemap
+module B [system] {
+ umbrella "."
+}
+
+//--- MacOSX.sdk/usr/include/B/B.h
+#include <C/C.h>
+typedef int B_t;
+
+//--- MacOSX.sdk/usr/include/C/module.modulemap
+module C [system] {
+ umbrella "."
+}
+
+//--- MacOSX.sdk/usr/include/C/C.h
+#include <D/D.h>
+
+//--- MacOSX.sdk/usr/include/D/module.modulemap
+module D [system] {
+ umbrella "."
+}
+
+// Simulate a header that will be resolved in a local directory, from a sysroot header.
+//--- MacOSX.sdk/usr/include/D/D.h
+#include <HeaderNotFoundInSDK.h>
+
+//--- BuildDir/module.modulemap
+module NotInSDK [system] {
+ umbrella "."
+}
+
+//--- BuildDir/HeaderNotFoundInSDK.h
+typedef int local_t;
+
+//--- client.m
+#include <A/A.h>
+#include <B/B.h>
diff --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index 3bdeb461e4bfa..f5946b30fb84d 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -471,6 +471,8 @@ class FullDeps {
for (auto &&ModID : ModuleIDs) {
auto &MD = Modules[ModID];
JOS.object([&] {
+ if (MD.IsInSysroot)
+ JOS.attribute("is-in-sysroot", MD.IsInSysroot);
JOS.attributeArray("clang-module-deps",
toJSONSorted(JOS, MD.ClangModuleDeps));
JOS.attribute("clang-modulemap-file",
|
I don't know what the desired properties of this feature are exactly, but I'm concerned that this does not guarantee the resulting command line only refers to inputs in the sysroot. Imagine that we run this without the header search and VFS pruning optimizations - the command line will likely refer to unused project-specific source/build directories. Even with those optimizations on, I'm not convinced we have great guarantees regarding the sysroot-only nature of the command line. I would like to see stronger guarantee. |
I would think that if
Do you have an intuition of whether this would be noticeably expensive for the scanner? |
dbc748c
to
fd4abaa
Compare
clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a similar concern to @jansvoboda11 that we probably need to check the invocation paths as well.
I would think that if ModuleDepCollector only found dependency inputs that resolve to sysroot locations, the command line for building the specific module wouldn't matter.
One concern I have is if we have user search paths, the contents of those paths can change after scanning in a way that silently affects the compilation. Another more theoretical issue is that in general the set of module input files has no guarantee of being complete -- it's derived from the source manager and we don't necessarily have a mechanism to ensure every file that impacted the module was captured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this looks good. I'm not sure I share the same concerns. The situation this covers is how likely is it that the build system is going to determine this needs to be rebuilt, and that happens (in most cases) purely based on the reported dependencies. So negative deps or unlisted deps don't change anything here.
I can add the checks for inputs through the compiler invocations, with the hope that it will allow for more pcm sharing across build plans (e.g. projects) depending on the same modules. I am a little worried that a more conservative check would mean fewer modules that otherwise could safely be in a shared location don't end up there due to other limitations (e.g. we don't prune out an option that is safe to), but I don't have a sense if that is a practical concern. |
* We need to track more than just sysroot, so start with sysroot and resource directory as a starting point, but allow the ability to capture more as they are discovered in the future. * Check a subset of the compiler invocation of module compilations for determining if a module can be shared. * When a module has a prebuilt module dependency, consider is not shareable for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm now, just a request to change the name.
clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
Outdated
Show resolved
Hide resolved
…locations (llvm#130634) That patch tracks whether all the file & module dependencies of a module resolve to a stable location. This information will later be queried by build systems for determining where to store the accompanying pcms. (cherry picked from commit 584f8cc)
…locations (llvm#130634) That patch tracks whether all the file & module dependencies of a module resolve to a stable location. This information will later be queried by build systems for determining where to store the accompanying pcms. (cherry picked from commit 584f8cc)
…locations (llvm#130634) That patch tracks whether all the file & module dependencies of a module resolve to a stable location. This information will later be queried by build systems for determining where to store the accompanying pcms. (cherry picked from commit 584f8cc)
That patch tracks whether all the file & module dependencies of a module resolve to a sysroot location. This information will later be queried by build systems for determining where to store the accompanying pcms.