Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -509,6 +509,8 @@ bool initializeScanCompilerInstance(
ScanInstance.getFrontendOpts().DisableFree = false;
ScanInstance.getFrontendOpts().GenerateGlobalModuleIndex = false;
ScanInstance.getFrontendOpts().UseGlobalModuleIndex = false;
ScanInstance.getFrontendOpts().GenReducedBMI = false;
ScanInstance.getFrontendOpts().ModuleOutputPath.clear();
// This will prevent us compiling individual modules asynchronously since
// FileManager is not thread-safe, but it does improve performance for now.
ScanInstance.getFrontendOpts().ModulesShareFileManager = true;
Expand Down
4 changes: 4 additions & 0 deletions clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -263,6 +263,10 @@ makeCommonInvocationForModuleBuild(CompilerInvocation CI) {
// units.
CI.getFrontendOpts().Inputs.clear();
CI.getFrontendOpts().OutputFile.clear();
CI.getFrontendOpts().GenReducedBMI = false;
CI.getFrontendOpts().ModuleOutputPath.clear();
CI.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = false;
CI.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = false;
Comment on lines +266 to +269
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we reset these when scanning too? At least for the Clang modules part of the scan.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope I added the changes in the place you intended, but I wasn’t completely sure.
I didn’t reset the options that are explicitly set for the scanning optimizations:

ScanInstance.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
ScanInstance.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;

For the other two (GenReducedBMI and ModuleOutputPath), I am not sure if they can have any effect when we’re not building a module, but it is probably safer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setBuildingModule(false) call only applies to the top-level TU. Clang modules that are imported by that TU have that set to true. My thinking was that when we're scanning a C++ TU with named modules enabled and another C++ TU with named modules disabled, they could theoretically still share the scanning PCMs for Clang modules. (Is this assumption correct?) We can only achieve that sharing if we remove the flags implied by named C++ modules enablement from the scan invocation (GenReducedBMI, ModuleOutputPath). Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

I believe that they are also set for all compiler invocations which generate a BMI (here):

if (Res.getFrontendOpts().GenReducedBMI ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateReducedModuleInterface ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateModuleInterface) {
Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;
}

and then also make it into the build invocations: gist with debug print

Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Do all the the scanning modules appear in the output of clang-scan-deps -format=full-experimental? If so, this part of the test should check this.
https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20

I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?

Is this assumption correct?

I believe so, but maybe some else with more knowledge of this would have to confirm.
Also thank you for reviewing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify why you're now also resetting ModulesSkipHeaderSearchPaths and ModulesSkipDiagnosticOptions for the build command lines here in ModuleDepCollector.cpp? AFAIK these are only used within the scanner to increase performance - they don't make it into the build invocations, nor should the driver invocations that scanner takes as an input have them enabled.

I believe that they are also set for all compiler invocations which generate a BMI (here):

if (Res.getFrontendOpts().GenReducedBMI ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateReducedModuleInterface ||
Res.getFrontendOpts().ProgramAction ==
frontend::GenerateModuleInterface) {
Res.getHeaderSearchOpts().ModulesSkipDiagnosticOptions = true;
Res.getHeaderSearchOpts().ModulesSkipHeaderSearchPaths = true;
}

Ah, I wasn't aware of that, thanks! This change makes sense then.

Your change in DependencyScannerImpl.cpp looks good, but it'd be good to have a test that checks the scanner itself doesn't produce more than one PCM variant in this scenario.

Do all the the scanning modules appear in the output of clang-scan-deps -format=full-experimental? If so, this part of the test should check this. https://github.com/naveen-seth/llvm-project/blob/147a39de68a981a51de7d9333f7b476139c43599/clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp#L39C1-L51C20

I’m not yet familiar with the internals of the scanning tooling in regards to the scanning modules. If not by adding a regression test, how could I check this? Where would the logic that controls this behavior live?

No, they don't make it to the output verbatim. The scanner may create multiple internal variants of a single module (due to different -ivfsoverlay flags for example) and after the scan decide that they can be merged into one variant (if the -ivfsoverlay files didn't affect the module contents and can be dropped). Only the merged module variant makes it to the scanner output.

To check the scanner-internal modules, you'd have to check the contents of the module cache directory used by the scanner and figure out if there are more variants than you expect. I believe we already have some tests in the test/ClangScanDeps directory that use find for this.

To sum up, this change looks fine, but let's add a test for the number of scanner-internal PCM variants created.

Copy link
Contributor Author

@naveen-seth naveen-seth Oct 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! I’ve added the test.
(The force push is just to resolve merge conflicts; no functional changes other than in 4b449c8, which adds the test)

// LLVM options are not going to affect the AST
CI.getFrontendOpts().LLVMArgs.clear();

Expand Down
121 changes: 121 additions & 0 deletions clang/test/ClangScanDeps/modules-context-hash-from-named-module.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
// Checks that driver-generated options for C++ module inputs preserve the
// canonical module build commands compared to an equivalent non-module input,
// and that they do not produce additional internal scanning PCMs.

// RUN: rm -rf %t
// RUN: split-file %s %t

//--- main.cpp
#include "root.h"
import A;
import B;

auto main() -> int { return 1; }

//--- A.cppm
module;
#include "root.h"
export module A;

//--- B.cppm
module;
#include "root.h"
export module B;

//--- module.modulemap
module root { header "root.h" }

//--- root.h
// empty

// RUN: %clang -std=c++23 -fmodules \
// RUN: -fmodules-cache-path=%t/modules-cache \
// RUN: %t/main.cpp %t/A.cppm %t/B.cppm \
// RUN: -fsyntax-only -fdriver-only -MJ %t/deps.json
//
// RUN: sed -e '1s/^/[/' -e '$s/,$/]/' -e 's:\\\\\?:/:g' %t/deps.json \
// RUN: > %t/compile_commands.json
//
// RUN: clang-scan-deps \
// RUN: -compilation-database=%t/compile_commands.json \
// RUN: -format experimental-full \
// RUN: | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t

// CHECK: {
// CHECK-NEXT: "modules": [
// CHECK-NEXT: {
// CHECK-NEXT: "clang-module-deps": [],
// CHECK-NEXT: "clang-modulemap-file": "[[PREFIX]]/module.modulemap",
// CHECK: "context-hash": "[[HASH_ROOT:.*]]",
// CHECK-NEXT: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/module.modulemap",
// CHECK-NEXT: "[[PREFIX]]/root.h"
// CHECK-NEXT: ],
// CHECK-NEXT: "link-libraries": [],
// CHECK-NEXT: "name": "root"
// CHECK-NEXT: }
// CHECK-NEXT: ],
// CHECK-NEXT: "translation-units": [
// CHECK-NEXT: {
// CHECK-NEXT: "commands": [
// CHECK-NEXT: {
// CHECK-NEXT: "clang-context-hash": "{{.*}}",
// CHECK-NEXT: "named-module-deps": [
// CHECK-NEXT: "A",
// CHECK-NEXT: "B"
// CHECK-NEXT: ],
// CHECK-NEXT: "clang-module-deps": [
// CHECK-NEXT: {
// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
// CHECK-NEXT: "module-name": "root"
// CHECK-NEXT: }
// CHECK-NEXT: ],
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/main.cpp"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/main.cpp"
// CHECK-NEXT: }
// CHECK-NEXT: ]
// CHECK-NEXT: },
// CHECK-NEXT: {
// CHECK-NEXT: "commands": [
// CHECK-NEXT: {
// CHECK-NEXT: "clang-context-hash": "{{.*}}",
// CHECK-NEXT: "named-module": "A",
// CHECK-NEXT: "clang-module-deps": [
// CHECK-NEXT: {
// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
// CHECK-NEXT: "module-name": "root"
// CHECK-NEXT: }
// CHECK-NEXT: ],
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/A.cppm"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/A.cppm"
// CHECK-NEXT: }
// CHECK-NEXT: ]
// CHECK-NEXT: },
// CHECK-NEXT: {
// CHECK-NEXT: "commands": [
// CHECK-NEXT: {
// CHECK-NEXT: "clang-context-hash": "{{.*}}",
// CHECK-NEXT: "named-module": "B",
// CHECK-NEXT: "clang-module-deps": [
// CHECK-NEXT: {
// CHECK-NEXT: "context-hash": "[[HASH_ROOT]]",
// CHECK-NEXT: "module-name": "root"
// CHECK-NEXT: }
// CHECK-NEXT: ],
// CHECK: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/B.cppm"
// CHECK-NEXT: ],
// CHECK-NEXT: "input-file": "[[PREFIX]]/B.cppm"
// CHECK-NEXT: }
// CHECK-NEXT: ]
// CHECK-NEXT: }
// CHECK-NEXT: ]
// CHECK-NEXT: }

// This tests that the scanner doesn't produce multiple internal scanning PCMs
// for our single Clang module (root).
// RUN: find %t/modules-cache -name "*.pcm" | wc -l | grep 1