Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][modules] Only compute affecting module maps with implicit search #87849

Merged

Conversation

jansvoboda11
Copy link
Contributor

@jansvoboda11 jansvoboda11 commented Apr 6, 2024

When writing out a PCM, we compute the set of module maps that did affect the compilation and we strip the rest to make the output independent of them. The most common way to read a module map that is not affecting is with implicit module map search. The other option is to pass a bunch of unnecessary -fmodule-map-file=<path> arguments on the command-line, in which case the client should probably not give those to Clang anyway.

This makes serialization of explicit modules faster, mostly due to reduced file system traffic.

@jansvoboda11 jansvoboda11 marked this pull request as ready for review April 8, 2024 16:19
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:modules C++20 modules and Clang Header Modules labels Apr 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 8, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Author: Jan Svoboda (jansvoboda11)

Changes

When writing out a PCM, we compute the set of module maps that did affect the compilation and we strip the rest to make the output independent of them. The most common way to read a module map that is not affecting is with implicit module map search. The other option is to pass a bunch of unnecessary -fmodule-map-file=&lt;path&gt; arguments on the command-line, in which case the client should probably not give those to Clang anyway.

This makes serialization of explicit modules faster, mostly due to reduced file system traffic.


Full diff: https://github.com/llvm/llvm-project/pull/87849.diff

1 Files Affected:

  • (modified) clang/lib/Serialization/ASTWriter.cpp (+17-4)
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index ba6a8a5e16e4e7..e1e22017595518 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -161,8 +161,13 @@ static TypeCode getTypeCodeForTypeClass(Type::TypeClass id) {
 
 namespace {
 
-std::set<const FileEntry *> GetAffectingModuleMaps(const Preprocessor &PP,
-                                                   Module *RootModule) {
+std::optional<std::set<const FileEntry *>>
+GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
+  // Without implicit module map search, there's no good reason to know about
+  // any module maps that are not affecting.
+  if (!PP.getHeaderSearchInfo().getHeaderSearchOpts().ImplicitModuleMaps)
+    return std::nullopt;
+
   SmallVector<const Module *> ModulesToProcess{RootModule};
 
   const HeaderSearch &HS = PP.getHeaderSearchInfo();
@@ -4733,8 +4738,16 @@ void ASTWriter::computeNonAffectingInputFiles() {
     if (!Cache->OrigEntry)
       continue;
 
-    if (!isModuleMap(File.getFileCharacteristic()) ||
-        llvm::is_contained(AffectingModuleMaps, *Cache->OrigEntry))
+    // Don't prune anything other than module maps.
+    if (!isModuleMap(File.getFileCharacteristic()))
+      continue;
+
+    // Don't prune module maps if all are guaranteed to be affecting.
+    if (!AffectingModuleMaps)
+      continue;
+
+    // Don't prune module maps that are affecting.
+    if (llvm::is_contained(*AffectingModuleMaps, *Cache->OrigEntry))
       continue;
 
     IsSLocAffecting[I] = false;

@jansvoboda11
Copy link
Contributor Author

CC @ilyakuteev

@jansvoboda11 jansvoboda11 merged commit 51786eb into llvm:main Apr 10, 2024
9 checks passed
@jansvoboda11 jansvoboda11 deleted the affecting-mms-only-after-search branch April 10, 2024 16:08
jansvoboda11 added a commit to apple/llvm-project that referenced this pull request Apr 10, 2024
…rch (llvm#87849)

When writing out a PCM, we compute the set of module maps that did
affect the compilation and we strip the rest to make the output
independent of them. The most common way to read a module map that is
not affecting is with implicit module map search. The other option is to
pass a bunch of unnecessary `-fmodule-map-file=<path>` arguments on the
command-line, in which case the client should probably not give those to
Clang anyway.

This makes serialization of explicit modules faster, mostly due to
reduced file system traffic.

(cherry picked from commit 51786eb)
Comment on lines +166 to +169
// Without implicit module map search, there's no good reason to know about
// any module maps that are not affecting.
if (!PP.getHeaderSearchInfo().getHeaderSearchOpts().ImplicitModuleMaps)
return std::nullopt;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct -- we can know about non-affecting module map files because they were specified on the command line using -fmodule-map-file= and either weren't actually used or (as happens in our case) we got the same information from a PCM file and so didn't need the module map file.

I think this check is a regression; can it be removed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I see that adding this check was the whole point of the patch. I don't think this works -- it's not feasible for the build system to know whether a module map file would affect a compilation or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware that you're passing unused module map files to the compiler. In that case I can introduce a separate flag to control this functionality and only disable the computation of affecting module maps for our flavor of explicit modules where this is a no-op. Sounds good?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds great, thank you. In our case, it's not that the module map files are unused, it's that we specify the same -fmodule-map-file= flag to all of our compilations, and so we usually get the information from a .pcm file rather than from the module map file, and it turns out that pruning out the repeated module maps make a rather large difference to our source location address space usage :)

All that said, we have some other ideas for how to address the problems we're seeing with source location address space usage. Maybe we can get to a point where we're not relying on this at all -- I think this patch is probably the right behavior, even though I think it's contributing to a regression for us, so I think this new behavior should be the default, and we can clean up the flag once we don't need it any more.

jansvoboda11 added a commit that referenced this pull request Apr 19, 2024
Pruning non-affecting module maps is useful even when passing module
maps explicitly via `-fmodule-map-file=<path>`. For this situation, this
patch reinstates the behavior we had prior to #87849. For the situation
where the explicit module map file arguments were generated by the
dependency scanner (which already pruned the non-affecting ones), this
patch introduces new `-cc1` flag
`-fno-modules-prune-non-affecting-module-map-files` that avoids the
extra work.
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 21, 2024
…9428)

Pruning non-affecting module maps is useful even when passing module
maps explicitly via `-fmodule-map-file=<path>`. For this situation, this
patch reinstates the behavior we had prior to llvm#87849. For the situation
where the explicit module map file arguments were generated by the
dependency scanner (which already pruned the non-affecting ones), this
patch introduces new `-cc1` flag
`-fno-modules-prune-non-affecting-module-map-files` that avoids the
extra work.
jansvoboda11 added a commit to apple/llvm-project that referenced this pull request May 1, 2024
…9428)

Pruning non-affecting module maps is useful even when passing module
maps explicitly via `-fmodule-map-file=<path>`. For this situation, this
patch reinstates the behavior we had prior to llvm#87849. For the situation
where the explicit module map file arguments were generated by the
dependency scanner (which already pruned the non-affecting ones), this
patch introduces new `-cc1` flag
`-fno-modules-prune-non-affecting-module-map-files` that avoids the
extra work.

(cherry picked from commit 3ea5dff)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants