-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[clang][DependencyScanning] Reset options generated for named module compilations. #161486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][DependencyScanning] Reset options generated for named module compilations. #161486
Conversation
@llvm/pr-subscribers-clang-modules @llvm/pr-subscribers-clang Author: Naveen Seth Hanig (naveen-seth) ChangesThe driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash). Full diff: https://github.com/llvm/llvm-project/pull/161486.diff 2 Files Affected:
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index d67178c153e88..a117bec0d656e 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -263,6 +263,10 @@ makeCommonInvocationForModuleBuild(CompilerInvocation CI) {
// units.
CI.getFrontendOpts().Inputs.clear();
CI.getFrontendOpts().OutputFile.clear();
+ CI.getFrontendOpts().GenReducedBMI = false;
+ CI.getFrontendOpts().ModuleOutputPath.clear();
+ CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = false;
+ CI.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = false;
// LLVM options are not going to affect the AST
CI.getFrontendOpts().LLVMArgs.clear();
diff --git a/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp b/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp
new file mode 100644
index 0000000000000..04148953b5683
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp
@@ -0,0 +1,111 @@
+// Check that the driver-generated command line options for C++ module inputs
+// do not unexpectedly change the context hash of imported Clang modules
+// compared to the command line for an otherwise equal non-module input.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- main.cpp
+#include "root.h"
+import A;
+import B;
+
+auto main() -> int { return 1; }
+
+//--- A.cppm
+module;
+#include "root.h"
+export module A;
+
+//--- B.cppm
+module;
+#include "root.h"
+export module B;
+
+//--- module.modulemap
+module root { header "root.h" }
+
+//--- root.h
+// empty
+
+// RUN: %clang++ -std=c++23 -fmodules %t/main.cpp %t/A.cppm %t/B.cppm -fsyntax-only -fdriver-only -MJ %t/deps.json
+// RUN: sed -e '1s/^/[/' -e '$s/,$/]/' %t/deps.json | sed 's:\\\\\?:/:g' > %t/compile_commands.json
+// RUN: clang-scan-deps -compilation-database=%t/compile_commands.json -format experimental-full > %t/result.json
+// RUN: cat %t/result.json
+
+// RUN: cat %t/result.json | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// CHECK: {
+// CHECK-NEXT: "modules": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "clang-module-deps": [],
+// CHECK-NEXT: "clang-modulemap-file": "[[PREFIX]]/module.modulemap",
+// CHECK: "context-hash": "[[HASH_ROOT:.*]]",
+// CHECK-NEXT: "file-deps": [
+// CHECK-NEXT: "[[PREFIX]]/module.modulemap",
+// CHECK-NEXT: "[[PREFIX]]/root.h"
+// CHECK-NEXT: ],
+// CHECK-NEXT: "link-libraries": [],
+// CHECK-NEXT: "name": "root"
+// CHECK-NEXT: }
+// CHECK-NEXT: ],
+// CHECK-NEXT: "translation-units": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "commands": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "clang-context-hash": "{{.*}}",
+// CHECK-NEXT: "named-module-deps": [
+// CHECK-NEXT: "A",
+// CHECK-NEXT: "B"
+// CHECK-NEXT: ],
+// CHECK-NEXT: "clang-module-deps": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT: "module-name": "root"
+// CHECK-NEXT: }
+// CHECK-NEXT: ],
+// CHECK: "file-deps": [
+// CHECK-NEXT: "[[PREFIX]]/main.cpp"
+// CHECK-NEXT: ],
+// CHECK-NEXT: "input-file": "[[PREFIX]]/main.cpp"
+// CHECK-NEXT: }
+// CHECK-NEXT: ]
+// CHECK-NEXT: },
+// CHECK-NEXT: {
+// CHECK-NEXT: "commands": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "clang-context-hash": "{{.*}}",
+// CHECK-NEXT: "named-module": "A",
+// CHECK-NEXT: "clang-module-deps": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT: "module-name": "root"
+// CHECK-NEXT: }
+// CHECK-NEXT: ],
+// CHECK: "file-deps": [
+// CHECK-NEXT: "[[PREFIX]]/A.cppm"
+// CHECK-NEXT: ],
+// CHECK-NEXT: "input-file": "[[PREFIX]]/A.cppm"
+// CHECK-NEXT: }
+// CHECK-NEXT: ]
+// CHECK-NEXT: },
+// CHECK-NEXT: {
+// CHECK-NEXT: "commands": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "clang-context-hash": "{{.*}}",
+// CHECK-NEXT: "named-module": "B",
+// CHECK-NEXT: "clang-module-deps": [
+// CHECK-NEXT: {
+// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT: "module-name": "root"
+// CHECK-NEXT: }
+// CHECK-NEXT: ],
+// CHECK: "file-deps": [
+// CHECK-NEXT: "[[PREFIX]]/B.cppm"
+// CHECK-NEXT: ],
+// CHECK-NEXT: "input-file": "[[PREFIX]]/B.cppm"
+// CHECK-NEXT: }
+// CHECK-NEXT: ]
+// CHECK-NEXT: }
+// CHECK-NEXT: ]
+// CHECK-NEXT: }
|
Using the DOT dependency graph remark introduced in #152770, this patch changes the dependency graph for the new regression test as follows: Before this patch:Because the driver-generated -cc1 command lines for C++ named module inputs affect the canonical module build command, the Clang module root ends up with a different context hash for each import from a named module input. After this patch: |
CI.getFrontendOpts().GenReducedBMI = false; | ||
CI.getFrontendOpts().ModuleOutputPath.clear(); | ||
CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = false; | ||
CI.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we reset these when scanning too? At least for the Clang modules part of the scan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I added the changes in the place you intended, but I wasn’t completely sure.
I didn’t reset the options that are explicitly set for the scanning optimizations:
llvm-project/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp
Lines 496 to 497 in 78c6554
ScanInstance.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true; | |
ScanInstance.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true; |
For the other two (GenReducedBMI
and ModuleOutputPath
), I am not sure if they can have any effect when we’re not building a module, but it is probably safer.
ScanInstance.setBuildingModule(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setBuildingModule(false)
call only applies to the top-level TU. Clang modules that are imported by that TU have that set to true
. My thinking was that when we're scanning a C++ TU with named modules enabled and another C++ TU with named modules disabled, they could theoretically still share the scanning PCMs for Clang modules. (Is this assumption correct?) We can only achieve that sharing if we remove the flags implied by named C++ modules enablement from the scan invocation (GenReducedBMI
, ModuleOutputPath
). Your change in DependencyScannerImpl.cpp
looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.
Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths
and ModulesSkipDiagnosticOptions
for the build command lines here in ModuleDepCollector.cpp
? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.
I believe that they are also set for all compiler invocations which generate a BMI (here):
llvm-project/clang/lib/Frontend/CompilerInvocation.cpp
Lines 5016 to 5023 in 36dc2a9
if (Res.getFrontendOpts().GenReducedBMI || | |
Res.getFrontendOpts().ProgramAction == | |
frontend::GenerateReducedModuleInterface || | |
Res.getFrontendOpts().ProgramAction == | |
frontend::GenerateModuleInterface) { | |
Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true; | |
Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true; | |
} |
and then also make it into the build invocations: gist with debug print
Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.
Do all the the scanning modules appear in the output of clang-scan-deps -format=full-experimental
? If so, this part of the test should check this.
https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20
I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?
Is this assumption correct?
I believe so, but maybe some else with more knowledge of this would have to confirm.
Also thank you for reviewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.
I believe that they are also set for all compiler invocations which generate a BMI (here):
llvm-project/clang/lib/Frontend/CompilerInvocation.cpp
Lines 5016 to 5023 in 36dc2a9
if (Res.getFrontendOpts().GenReducedBMI || Res.getFrontendOpts().ProgramAction == frontend::GenerateReducedModuleInterface || Res.getFrontendOpts().ProgramAction == frontend::GenerateModuleInterface) { Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true; Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true; }
Ah, I wasn't aware of that, thanks! This change makes sense then.
Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.
Do all the the scanning modules appear in the output of
clang-scan-deps -format=full-experimental
? If so, this part of the test should check this. https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?
No, they don't make it to the output verbatim. The scanner may create multiple internal variants of a single module (due to different -ivfsoverlay
flags for example) and after the scan decide that they can be merged into one variant (if the -ivfsoverlay
files didn't affect the module contents and can be dropped). Only the merged module variant makes it to the scanner output.
To check the scanner-internal modules, you'd have to check the contents of the module cache directory used by the scanner and figure out if there are more variants than you expect. I believe we already have some tests in the test/ClangScanDeps
directory that use find
for this.
To sum up, this change looks fine, but let's add a test for the number of scanner-internal PCM variants created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! I’ve added the test.
(The force push is just to resolve merge conflicts; no functional changes other than in 4b449c8, which adds the test)
…compilations. The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash). This resets those options.
2fbe479
to
4b449c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash).
This resets those options.