Skip to content

Conversation

naveen-seth
Copy link
Contributor

The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash).
This resets those options.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 1, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 1, 2025

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Naveen Seth Hanig (naveen-seth)

Changes

The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash).
This resets those options.


Full diff: https://github.com/llvm/llvm-project/pull/161486.diff

2 Files Affected:

  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+4)
  • (added) clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp (+111)
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index d67178c153e88..a117bec0d656e 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -263,6 +263,10 @@ makeCommonInvocationForModuleBuild(CompilerInvocation CI) {
   // units.
   CI.getFrontendOpts().Inputs.clear();
   CI.getFrontendOpts().OutputFile.clear();
+  CI.getFrontendOpts().GenReducedBMI = false;
+  CI.getFrontendOpts().ModuleOutputPath.clear();
+  CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = false;
+  CI.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = false;
   // LLVM options are not going to affect the AST
   CI.getFrontendOpts().LLVMArgs.clear();
 
diff --git a/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp b/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp
new file mode 100644
index 0000000000000..04148953b5683
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp
@@ -0,0 +1,111 @@
+// Check that the driver-generated command line options for  C++ module inputs
+// do not unexpectedly change the context hash of imported Clang modules
+// compared to the command line for an otherwise equal non-module input.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- main.cpp
+#include "root.h"
+import A;
+import B;
+
+auto main() -> int { return 1; }
+
+//--- A.cppm
+module;
+#include "root.h"
+export module A;
+
+//--- B.cppm
+module;
+#include "root.h"
+export module B;
+
+//--- module.modulemap
+module root { header "root.h" }
+
+//--- root.h
+// empty
+
+// RUN: %clang++ -std=c++23 -fmodules %t/main.cpp %t/A.cppm %t/B.cppm -fsyntax-only -fdriver-only -MJ %t/deps.json
+// RUN: sed -e '1s/^/[/' -e '$s/,$/]/' %t/deps.json | sed 's:\\\\\?:/:g' > %t/compile_commands.json
+// RUN: clang-scan-deps -compilation-database=%t/compile_commands.json -format experimental-full > %t/result.json
+// RUN: cat %t/result.json
+
+// RUN: cat %t/result.json | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// CHECK:      {
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "clang-module-deps": [],
+// CHECK-NEXT:       "clang-modulemap-file": "[[PREFIX]]/module.modulemap",
+// CHECK:            "context-hash": "[[HASH_ROOT:.*]]",
+// CHECK-NEXT:       "file-deps": [
+// CHECK-NEXT:         "[[PREFIX]]/module.modulemap",
+// CHECK-NEXT:         "[[PREFIX]]/root.h"
+// CHECK-NEXT:       ],
+// CHECK-NEXT:       "link-libraries": [],
+// CHECK-NEXT:       "name": "root"
+// CHECK-NEXT:     }
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "translation-units": [
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "commands": [
+// CHECK-NEXT:         {
+// CHECK-NEXT:           "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:           "named-module-deps": [
+// CHECK-NEXT:             "A",
+// CHECK-NEXT:             "B"
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "clang-module-deps": [
+// CHECK-NEXT:             {
+// CHECK-NEXT:               "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT:               "module-name": "root"
+// CHECK-NEXT:             }
+// CHECK-NEXT:           ],
+// CHECK:                "file-deps": [
+// CHECK-NEXT:             "[[PREFIX]]/main.cpp"
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "input-file": "[[PREFIX]]/main.cpp"
+// CHECK-NEXT:         }
+// CHECK-NEXT:       ]
+// CHECK-NEXT:     },
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "commands": [
+// CHECK-NEXT:         {
+// CHECK-NEXT:           "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:           "named-module": "A",
+// CHECK-NEXT:           "clang-module-deps": [
+// CHECK-NEXT:             {
+// CHECK-NEXT:               "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT:               "module-name": "root"
+// CHECK-NEXT:             }
+// CHECK-NEXT:           ],
+// CHECK:                "file-deps": [
+// CHECK-NEXT:             "[[PREFIX]]/A.cppm"
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "input-file": "[[PREFIX]]/A.cppm"
+// CHECK-NEXT:         }
+// CHECK-NEXT:       ]
+// CHECK-NEXT:     },
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "commands": [
+// CHECK-NEXT:         {
+// CHECK-NEXT:           "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:           "named-module": "B",
+// CHECK-NEXT:           "clang-module-deps": [
+// CHECK-NEXT:             {
+// CHECK-NEXT:               "context-hash": "[[HASH_ROOT]]",
+// CHECK-NEXT:               "module-name": "root"
+// CHECK-NEXT:             }
+// CHECK-NEXT:           ],
+// CHECK:                "file-deps": [
+// CHECK-NEXT:             "[[PREFIX]]/B.cppm"
+// CHECK-NEXT:           ],
+// CHECK-NEXT:           "input-file": "[[PREFIX]]/B.cppm"
+// CHECK-NEXT:         }
+// CHECK-NEXT:       ]
+// CHECK-NEXT:     }
+// CHECK-NEXT:   ]
+// CHECK-NEXT: }

@naveen-seth
Copy link
Contributor Author

naveen-seth commented Oct 1, 2025

Using the DOT dependency graph remark introduced in #152770, this patch changes the dependency graph for the new regression test as follows:

Before this patch:

graph-before-fix

Because the driver-generated -cc1 command lines for C++ named module inputs affect the canonical module build command, the Clang module root ends up with a different context hash for each import from a named module input.

After this patch:

graph-after-fix

@naveen-seth naveen-seth added the clang:modules C++20 modules and Clang Header Modules label Oct 1, 2025
Comment on lines +266 to +269
CI.getFrontendOpts().GenReducedBMI = false;
CI.getFrontendOpts().ModuleOutputPath.clear();
CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = false;
CI.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we reset these when scanning too? At least for the Clang modules part of the scan.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope I added the changes in the place you intended, but I wasn’t completely sure.
I didn’t reset the options that are explicitly set for the scanning optimizations:

ScanInstance.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
ScanInstance.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;

For the other two (GenReducedBMI and ModuleOutputPath), I am not sure if they can have any effect when we’re not building a module, but it is probably safer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setBuildingModule(false) call only applies to the top-level TU. Clang modules that are imported by that TU have that set to true. My thinking was that when we're scanning a C++ TU with named modules enabled and another C++ TU with named modules disabled, they could theoretically still share the scanning PCMs for Clang modules. (Is this assumption correct?) We can only achieve that sharing if we remove the flags implied by named C++ modules enablement from the scan invocation (GenReducedBMI, ModuleOutputPath). Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

I believe that they are also set for all compiler invocations which generate a BMI (here):

if (Res.getFrontendOpts().GenReducedBMI ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateReducedModuleInterface ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateModuleInterface) {
Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;
}

and then also make it into the build invocations: gist with debug print

Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Do all the the scanning modules appear in the output of clang-scan-deps -format=full-experimental? If so, this part of the test should check this.
https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20

I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?

Is this assumption correct?

I believe so, but maybe some else with more knowledge of this would have to confirm.
Also thank you for reviewing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

I believe that they are also set for all compiler invocations which generate a BMI (here):

if (Res.getFrontendOpts().GenReducedBMI ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateReducedModuleInterface ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateModuleInterface) {
Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;
}

Ah, I wasn't aware of that, thanks! This change makes sense then.

Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Do all the the scanning modules appear in the output of clang-scan-deps -format=full-experimental? If so, this part of the test should check this. https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20

I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?

No, they don't make it to the output verbatim. The scanner may create multiple internal variants of a single module (due to different -ivfsoverlay flags for example) and after the scan decide that they can be merged into one variant (if the -ivfsoverlay files didn't affect the module contents and can be dropped). Only the merged module variant makes it to the scanner output.

To check the scanner-internal modules, you'd have to check the contents of the module cache directory used by the scanner and figure out if there are more variants than you expect. I believe we already have some tests in the test/ClangScanDeps directory that use find for this.

To sum up, this change looks fine, but let's add a test for the number of scanner-internal PCM variants created.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! I’ve added the test.
(The force push is just to resolve merge conflicts; no functional changes other than in 4b449c8, which adds the test)

…compilations.

The driver-generated -cc1 command-lines for C++ named module inputs
introduce some command-line options which affect the canonical module
build command (and therefore the context hash).
This resets those options.
@naveen-seth naveen-seth force-pushed the fix-context-hash-for-imports-from-named-mod branch from 2fbe479 to 4b449c8 Compare October 4, 2025 13:29
Copy link
Contributor

@jansvoboda11 jansvoboda11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@naveen-seth naveen-seth merged commit 2f3bb76 into llvm:main Oct 6, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants