[clang][ScanDeps] Allow PCHs to have different VFS overlays #82294

Merged
merged 1 commit into from
Feb 24, 2024

Bigcheese
Contributor

It turns out it's not that uncommon for real code to pass a different set of VFSs while building a PCH than while using the PCH. This can cause problems as seen in test/ClangScanDeps/optimize-vfs-pch.m. If you scan compile-commands-tu-no-vfs-error.json without -Werror and run the resulting commands, Clang will emit a fatal error while trying to emit a note saying that it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are part of an included PCH, as the same issue can occur in a module we are building if we drop VFSs. This has to be textual because we have no guarantee the PCH had the same list of VFSs as the current TU.
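
As a rough illustration of why the tracking is textual: an index into the PCH's overlay list says nothing about the current TU's list, since the two lists can differ in length and order, so the only safe comparison is by path string. The sketch below uses standard containers in place of `llvm::StringSet<>` and `llvm::BitVector`; the name `markOverlaysUsedByPrebuilt` is hypothetical and not part of the patch.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the overlay paths a prebuilt module was built with.
using OverlaySet = std::set<std::string>;

// For each overlay of the current TU (by index), record whether the same
// path string also appears in the prebuilt module's overlay set.
std::vector<bool>
markOverlaysUsedByPrebuilt(const std::vector<std::string> &TUOverlays,
                           const OverlaySet &PrebuiltOverlays) {
  std::vector<bool> Used(TUOverlays.size(), false);
  for (std::size_t I = 0, E = TUOverlays.size(); I != E; ++I)
    if (PrebuiltOverlays.count(TUOverlays[I]))
      Used[I] = true;
  return Used;
}
```

An overlay that only the PCH saw has no index in the TU's list, so it simply never sets a bit; an overlay that both saw is kept for any module that depends on the prebuilt one.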

This uses the PrebuiltModuleListener to collect VFSOverlayFiles instead of trying to extract it out of a serialization::ModuleFile each time it's needed. There's not a great way to just store a pointer to the list of strings in the serialized AST.
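
A minimal model of that collection step, assuming a listener that is told which file is being read before it receives that file's header search options. The real listener derives from `ASTReaderListener`; the class name `OverlayCollectingListener` and the use of `std::map`/`std::set` here are simplifications for illustration only.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using OverlayMap = std::map<std::string, std::set<std::string>>;

// Simplified, hypothetical listener that records overlays per module file.
class OverlayCollectingListener {
public:
  explicit OverlayCollectingListener(OverlayMap &Map) : Map(Map) {}

  // Remember which file the following callbacks refer to.
  void visitModuleFile(const std::string &Filename) { CurrentFile = Filename; }

  // Record the overlays stored with the current file. std::map::insert
  // leaves an existing entry untouched if the file was already seen.
  void readHeaderSearchPaths(const std::vector<std::string> &Overlays) {
    Map.insert({CurrentFile,
                std::set<std::string>(Overlays.begin(), Overlays.end())});
  }

private:
  OverlayMap &Map;
  std::string CurrentFile;
};
```

Collecting the strings once, while the control blocks are being read anyway, avoids having to dig them out of each module file again later.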

@llvmbot added the clang, clang:frontend, and llvm:adt labels on Feb 20, 2024
@llvmbot
Collaborator

llvmbot commented Feb 20, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-adt

Author: Michael Spencer (Bigcheese)

Patch is 21.12 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82294.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSerializationKinds.td (+3-1)
  • (modified) clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h (+5)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp (+47-11)
  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+26-8)
  • (modified) clang/test/ClangScanDeps/optimize-vfs-pch.m (+105-9)
  • (modified) llvm/include/llvm/ADT/StringSet.h (+4)
diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 4c4659ed517e0a..eb27de5921d6a1 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,7 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
-def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS overlay files than are currently in use">;
+def warn_pch_vfsoverlay_mismatch : Warning<
+  "PCH was compiled with different VFS overlay files than are currently in use">,
+  InGroup<DiagGroup<"pch-vfs-diff">>;
 def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 has the following VFS overlays:\n%1">;
 def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 has no VFS overlays">;
 
diff --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 13ad2530864927..081899cc2c8503 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -149,6 +149,8 @@ struct ModuleDeps {
       BuildInfo;
 };
 
+using PrebuiltModuleVFSMapT = llvm::StringMap<llvm::StringSet<>>;
+
 class ModuleDepCollector;
 
 /// Callback that records textual includes and direct modular includes/imports
@@ -214,6 +216,7 @@ class ModuleDepCollector final : public DependencyCollector {
                      CompilerInstance &ScanInstance, DependencyConsumer &C,
                      DependencyActionController &Controller,
                      CompilerInvocation OriginalCI,
+                     PrebuiltModuleVFSMapT PrebuiltModuleVFSMap,
                      ScanningOptimizations OptimizeArgs, bool EagerLoadModules,
                      bool IsStdModuleP1689Format);
 
@@ -233,6 +236,8 @@ class ModuleDepCollector final : public DependencyCollector {
   DependencyConsumer &Consumer;
   /// Callbacks for computing dependency information.
   DependencyActionController &Controller;
+  /// Mapping from prebuilt AST files to their sorted list of VFS overlay files.
+  PrebuiltModuleVFSMapT PrebuiltModuleVFSMap;
   /// Path to the main source file.
   std::string MainFile;
   /// Hash identifying the compilation conditions of the current TU.
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..b252463a08b8d7 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -24,6 +24,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningService.h"
 #include "clang/Tooling/DependencyScanning/ModuleDepCollector.h"
 #include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Host.h"
@@ -67,7 +68,7 @@ static bool checkHeaderSearchPaths(const HeaderSearchOptions &HSOpts,
   if (LangOpts.Modules) {
     if (HSOpts.VFSOverlayFiles != ExistingHSOpts.VFSOverlayFiles) {
       if (Diags) {
-        Diags->Report(diag::err_pch_vfsoverlay_mismatch);
+        Diags->Report(diag::warn_pch_vfsoverlay_mismatch);
         auto VFSNote = [&](int Type, ArrayRef<std::string> VFSOverlays) {
           if (VFSOverlays.empty()) {
             Diags->Report(diag::note_pch_vfsoverlay_empty) << Type;
@@ -79,7 +80,6 @@ static bool checkHeaderSearchPaths(const HeaderSearchOptions &HSOpts,
         VFSNote(0, HSOpts.VFSOverlayFiles);
         VFSNote(1, ExistingHSOpts.VFSOverlayFiles);
       }
-      return true;
     }
   }
   return false;
@@ -93,10 +93,12 @@ class PrebuiltModuleListener : public ASTReaderListener {
 public:
   PrebuiltModuleListener(PrebuiltModuleFilesT &PrebuiltModuleFiles,
                          llvm::SmallVector<std::string> &NewModuleFiles,
+                         PrebuiltModuleVFSMapT &PrebuiltModuleVFSMap,
                          const HeaderSearchOptions &HSOpts,
                          const LangOptions &LangOpts, DiagnosticsEngine &Diags)
       : PrebuiltModuleFiles(PrebuiltModuleFiles),
-        NewModuleFiles(NewModuleFiles), ExistingHSOpts(HSOpts),
+        NewModuleFiles(NewModuleFiles),
+        PrebuiltModuleVFSMap(PrebuiltModuleVFSMap), ExistingHSOpts(HSOpts),
         ExistingLangOpts(LangOpts), Diags(Diags) {}
 
   bool needsImportVisitation() const override { return true; }
@@ -106,8 +108,16 @@ class PrebuiltModuleListener : public ASTReaderListener {
       NewModuleFiles.push_back(Filename.str());
   }
 
+  void visitModuleFile(StringRef Filename,
+                       serialization::ModuleKind Kind) override {
+    CurrentFile = Filename;
+  }
+
   bool ReadHeaderSearchPaths(const HeaderSearchOptions &HSOpts,
                              bool Complain) override {
+    std::vector<std::string> VFSOverlayFiles = HSOpts.VFSOverlayFiles;
+    PrebuiltModuleVFSMap.insert(
+        {CurrentFile, llvm::StringSet<>(VFSOverlayFiles)});
     return checkHeaderSearchPaths(
         HSOpts, ExistingHSOpts, Complain ? &Diags : nullptr, ExistingLangOpts);
   }
@@ -115,9 +125,11 @@ class PrebuiltModuleListener : public ASTReaderListener {
 private:
   PrebuiltModuleFilesT &PrebuiltModuleFiles;
   llvm::SmallVector<std::string> &NewModuleFiles;
+  PrebuiltModuleVFSMapT &PrebuiltModuleVFSMap;
   const HeaderSearchOptions &ExistingHSOpts;
   const LangOptions &ExistingLangOpts;
   DiagnosticsEngine &Diags;
+  std::string CurrentFile;
 };
 
 /// Visit the given prebuilt module and collect all of the modules it
@@ -125,12 +137,16 @@ class PrebuiltModuleListener : public ASTReaderListener {
 static bool visitPrebuiltModule(StringRef PrebuiltModuleFilename,
                                 CompilerInstance &CI,
                                 PrebuiltModuleFilesT &ModuleFiles,
+                                PrebuiltModuleVFSMapT &PrebuiltModuleVFSMap,
                                 DiagnosticsEngine &Diags) {
   // List of module files to be processed.
   llvm::SmallVector<std::string> Worklist;
-  PrebuiltModuleListener Listener(
-      ModuleFiles, Worklist, CI.getHeaderSearchOpts(), CI.getLangOpts(), Diags);
+  PrebuiltModuleListener Listener(ModuleFiles, Worklist, PrebuiltModuleVFSMap,
+                                  CI.getHeaderSearchOpts(), CI.getLangOpts(),
+                                  Diags);
 
+  Listener.visitModuleFile(PrebuiltModuleFilename,
+                           serialization::MK_ExplicitModule);
   if (ASTReader::readASTFileControlBlock(
           PrebuiltModuleFilename, CI.getFileManager(), CI.getModuleCache(),
           CI.getPCHContainerReader(),
@@ -139,6 +155,7 @@ static bool visitPrebuiltModule(StringRef PrebuiltModuleFilename,
     return true;
 
   while (!Worklist.empty()) {
+    Listener.visitModuleFile(Worklist.back(), serialization::MK_ExplicitModule);
     if (ASTReader::readASTFileControlBlock(
             Worklist.pop_back_val(), CI.getFileManager(), CI.getModuleCache(),
             CI.getPCHContainerReader(),
@@ -175,8 +192,19 @@ static void sanitizeDiagOpts(DiagnosticOptions &DiagOpts) {
   DiagOpts.ShowCarets = false;
   // Don't write out diagnostic file.
   DiagOpts.DiagnosticSerializationFile.clear();
-  // Don't emit warnings as errors (and all other warnings too).
-  DiagOpts.IgnoreWarnings = true;
+  // Don't emit warnings except for scanning specific warnings.
+  // TODO: It would be useful to add a more principled way to ignore all
+  //       warnings that come from source code. The issue is that we need to
+  //       ignore warnings that could be surpressed by
+  //       `#pragma clang diagnostic`, while still allowing some scanning
+  //       warnings for things we're not ready to turn into errors yet.
+  //       See `test/ClangScanDeps/diagnostic-pragmas.c` for an example.
+  llvm::erase_if(DiagOpts.Warnings, [](StringRef Warning) {
+    return llvm::StringSwitch<bool>(Warning)
+        .Cases("pch-vfs-diff", "error=pch-vfs-diff", false)
+        .StartsWith("no-error=", false)
+        .Default(true);
+  });
 }
 
 /// A clang tool that runs the preprocessor in a mode that's optimized for
@@ -226,6 +254,10 @@ class DependencyScanningAction : public tooling::ToolAction {
     if (!ScanInstance.hasDiagnostics())
       return false;
 
+    // Some DiagnosticConsumers require that finish() is called.
+    auto DiagConsumerFinisher =
+        llvm::make_scope_exit([DiagConsumer]() { DiagConsumer->finish(); });
+
     ScanInstance.getPreprocessorOpts().AllowPCHWithDifferentModulesCachePath =
         true;
 
@@ -233,7 +265,8 @@ class DependencyScanningAction : public tooling::ToolAction {
     ScanInstance.getFrontendOpts().UseGlobalModuleIndex = false;
     ScanInstance.getFrontendOpts().ModulesShareFileManager = false;
     ScanInstance.getHeaderSearchOpts().ModuleFormat = "raw";
-    ScanInstance.getHeaderSearchOpts().ModulesIncludeVFSUsage = true;
+    ScanInstance.getHeaderSearchOpts().ModulesIncludeVFSUsage =
+        any(OptimizeArgs & ScanningOptimizations::VFS);
 
     ScanInstance.setFileManager(FileMgr);
     // Support for virtual file system overlays.
@@ -246,12 +279,13 @@ class DependencyScanningAction : public tooling::ToolAction {
     // Store the list of prebuilt module files into header search options. This
     // will prevent the implicit build to create duplicate modules and will
     // force reuse of the existing prebuilt module files instead.
+    PrebuiltModuleVFSMapT PrebuiltModuleVFSMap;
     if (!ScanInstance.getPreprocessorOpts().ImplicitPCHInclude.empty())
       if (visitPrebuiltModule(
               ScanInstance.getPreprocessorOpts().ImplicitPCHInclude,
               ScanInstance,
               ScanInstance.getHeaderSearchOpts().PrebuiltModuleFiles,
-              ScanInstance.getDiagnostics()))
+              PrebuiltModuleVFSMap, ScanInstance.getDiagnostics()))
         return false;
 
     // Use the dependency scanning optimized file system if requested to do so.
@@ -295,8 +329,8 @@ class DependencyScanningAction : public tooling::ToolAction {
     case ScanningOutputFormat::Full:
       MDC = std::make_shared<ModuleDepCollector>(
           std::move(Opts), ScanInstance, Consumer, Controller,
-          OriginalInvocation, OptimizeArgs, EagerLoadModules,
-          Format == ScanningOutputFormat::P1689);
+          OriginalInvocation, std::move(PrebuiltModuleVFSMap), OptimizeArgs,
+          EagerLoadModules, Format == ScanningOutputFormat::P1689);
       ScanInstance.addDependencyCollector(MDC);
       break;
     }
@@ -325,6 +359,8 @@ class DependencyScanningAction : public tooling::ToolAction {
     if (ScanInstance.getDiagnostics().hasErrorOccurred())
       return false;
 
+    // Each action is responsible for calling finish.
+    DiagConsumerFinisher.release();
     const bool Result = ScanInstance.ExecuteAction(*Action);
 
     if (Result)
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 5a9e563c2d5b26..eb5c50c35428fe 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -29,10 +29,11 @@ const std::vector<std::string> &ModuleDeps::getBuildArguments() {
   return std::get<std::vector<std::string>>(BuildInfo);
 }
 
-static void optimizeHeaderSearchOpts(HeaderSearchOptions &Opts,
-                                     ASTReader &Reader,
-                                     const serialization::ModuleFile &MF,
-                                     ScanningOptimizations OptimizeArgs) {
+static void
+optimizeHeaderSearchOpts(HeaderSearchOptions &Opts, ASTReader &Reader,
+                         const serialization::ModuleFile &MF,
+                         const PrebuiltModuleVFSMapT &PrebuiltModuleVFSMap,
+                         ScanningOptimizations OptimizeArgs) {
   if (any(OptimizeArgs & ScanningOptimizations::HeaderSearch)) {
     // Only preserve search paths that were used during the dependency scan.
     std::vector<HeaderSearchOptions::Entry> Entries;
@@ -65,11 +66,25 @@ static void optimizeHeaderSearchOpts(HeaderSearchOptions &Opts,
     llvm::DenseSet<const serialization::ModuleFile *> Visited;
     std::function<void(const serialization::ModuleFile *)> VisitMF =
         [&](const serialization::ModuleFile *MF) {
-          VFSUsage |= MF->VFSUsage;
           Visited.insert(MF);
-          for (const serialization::ModuleFile *Import : MF->Imports)
-            if (!Visited.contains(Import))
-              VisitMF(Import);
+          if (MF->Kind == serialization::MK_ImplicitModule) {
+            VFSUsage |= MF->VFSUsage;
+            // We only need to recurse into implicit modules. Other module types
+            // will have the correct set of VFSs for anything they depend on.
+            for (const serialization::ModuleFile *Import : MF->Imports)
+              if (!Visited.contains(Import))
+                VisitMF(Import);
+          } else {
+            // This is not an implicitly built module, so it may have different
+            // VFS options. Fall back to a string comparison instead.
+            auto VFSMap = PrebuiltModuleVFSMap.find(MF->FileName);
+            if (VFSMap == PrebuiltModuleVFSMap.end())
+              return;
+            for (std::size_t I = 0, E = VFSOverlayFiles.size(); I != E; ++I) {
+              if (VFSMap->second.contains(VFSOverlayFiles[I]))
+                VFSUsage[I] = true;
+            }
+          }
         };
     VisitMF(&MF);
 
@@ -596,6 +611,7 @@ ModuleDepCollectorPP::handleTopLevelModule(const Module *M) {
                                         ScanningOptimizations::VFS)))
               optimizeHeaderSearchOpts(BuildInvocation.getMutHeaderSearchOpts(),
                                        *MDC.ScanInstance.getASTReader(), *MF,
+                                       MDC.PrebuiltModuleVFSMap,
                                        MDC.OptimizeArgs);
             if (any(MDC.OptimizeArgs & ScanningOptimizations::SystemWarnings))
               optimizeDiagnosticOpts(
@@ -697,9 +713,11 @@ ModuleDepCollector::ModuleDepCollector(
     std::unique_ptr<DependencyOutputOptions> Opts,
     CompilerInstance &ScanInstance, DependencyConsumer &C,
     DependencyActionController &Controller, CompilerInvocation OriginalCI,
+    PrebuiltModuleVFSMapT PrebuiltModuleVFSMap,
     ScanningOptimizations OptimizeArgs, bool EagerLoadModules,
     bool IsStdModuleP1689Format)
     : ScanInstance(ScanInstance), Consumer(C), Controller(Controller),
+      PrebuiltModuleVFSMap(std::move(PrebuiltModuleVFSMap)),
       Opts(std::move(Opts)),
       CommonInvocation(
           makeCommonInvocationForModuleBuild(std::move(OriginalCI))),
diff --git a/clang/test/ClangScanDeps/optimize-vfs-pch.m b/clang/test/ClangScanDeps/optimize-vfs-pch.m
index e6acb73e1dd343..0b5cb08d365eee 100644
--- a/clang/test/ClangScanDeps/optimize-vfs-pch.m
+++ b/clang/test/ClangScanDeps/optimize-vfs-pch.m
@@ -4,7 +4,8 @@
 // RUN: split-file %s %t
 // RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands-pch.json.in > %t/build/compile-commands-pch.json
 // RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands-tu.json.in > %t/build/compile-commands-tu.json
-// RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands-tu-no-vfs.json.in > %t/build/compile-commands-tu-no-vfs.json
+// RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands-tu-no-vfs-error.json.in > %t/build/compile-commands-tu-no-vfs-error.json
+// RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands-tu1.json.in > %t/build/compile-commands-tu1.json
 // RUN: sed -e "s|DIR|%/t|g" %t/build/pch-overlay.yaml.in > %t/build/pch-overlay.yaml
 
 // RUN: clang-scan-deps -compilation-database %t/build/compile-commands-pch.json \
@@ -23,11 +24,66 @@
 // RUN: %clang @%t/C.rsp
 // RUN: %clang @%t/tu.rsp
 
-// RUN: not clang-scan-deps -compilation-database %t/build/compile-commands-tu-no-vfs.json \
-// RUN:   -j 1 -format experimental-full --optimize-args=vfs,header-search 2>&1 | FileCheck %s
-
-// CHECK: error: PCH was compiled with different VFS overlay files than are currently in use
-// CHECK: note: current translation unit has no VFS overlays
+// RUN: not clang-scan-deps -compilation-database %t/build/compile-commands-tu-no-vfs-error.json \
+// RUN:   -j 1 -format experimental-full --optimize-args=vfs,header-search 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s
+
+// CHECK-ERROR: error: PCH was compiled with different VFS overlay files than are currently in use
+// CHECK-ERROR: note: current translation unit has no VFS overlays
+
+// Next test is to verify that a module that doesn't use the VFS, that depends
+// on the PCH's A, which does use the VFS, still records that it needs the VFS.
+// This avoids a fatal error when emitting diagnostics.
+
+// RUN: clang-scan-deps -compilation-database %t/build/compile-commands-tu1.json \
+// RUN:   -j 1 -format experimental-full --optimize-args=vfs,header-search > %t/tu1-deps.db
+// RUN: %deps-to-rsp %t/tu1-deps.db --tu-index=0 > %t/tu1.rsp
+// Reuse existing B
+// RUN: %deps-to-rsp %t/tu1-deps.db --module-name=E > %t/E.rsp
+// RUN: %deps-to-rsp %t/tu1-deps.db --module-name=D > %t/D.rsp
+// The build of D depends on B which depend on the prebuilt A. D will only build
+// if it has A's VFS, as it needs to emit a diagnostic showing the content of A.
+// RUN: %clang @%t/E.rsp
+// RUN: %clang @%t/D.rsp -verify
+// RUN: %clang @%t/tu1.rsp
+// RUN: cat %t/tu1-deps.db | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// Check that D has the overlay, but E doesn't.
+// CHECK:      {
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "clang-module-deps": [
+// CHECK-NEXT:         {
+// CHECK-NEXT:           "context-hash": "{{.*}}",
+// CHECK-NEXT:           "module-name": "E"
+// CHECK-NEXT:         }
+// CHECK-NEXT:       ],
+// CHECK-NEXT:       "clang-modulemap-file": "[[PREFIX]]/modules/D/module.modulemap",
+// CHECK-NEXT:       "command-line": [
+// CHECK:              "-ivfsoverlay"
+// CHECK-NEXT:         "[[PREFIX]]/build/pch-overlay.yaml"
+// CHECK:            ],
+// CHECK-NEXT:       "context-hash": "{{.*}}",
+// CHECK-NEXT:       "file-deps": [
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:       ],
+// CHECK:            "name": "D"
+// CHECK-NEXT:     },
+// CHECK-NEXT:     {
+// CHECK-NEXT:       "clang-module-deps": [],
+// CHECK-NEXT:       "clang-modulemap-file": "[[PREFIX]]/modules/E/module.modulemap",
+// CHECK-NEXT:       "command-line": [
+// CHECK-NOT:          "-ivfsoverlay"
+// CHECK:            ],
+// CHECK-NEXT:       "context-hash": "{{.*}}",
+// CHECK-NEXT:       "file-deps": [
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:         "{{.*}}"
+// CHECK-NEXT:       ],
+// CHECK:            "name": "E"
+// CHECK-NEXT:     }
 
 //--- build/compile-commands-pch.json.in
 
@@ -49,16 +105,26 @@
 }
 ]
 
-//--- build/compile-commands-tu-no-vfs.json.in
+//--- build/compile-commands-tu-no-vfs-error.json.in
 
 [
 {
   "directory": "DIR",
-  "command": "clang -fsyntax-only DIR/tu.m -I DIR/modules/A -I DIR/modules/B -I DIR/modules/C -fmodules -fimplicit-module-maps -fmodules-cache-path=DIR/cache -include DIR/pch.h -o DIR/tu.o",
+  "command": "clang -Wpch-vfs-diff -Werror=pch-vfs-diff -fsyntax-only DIR/tu.m -I DIR/modules/A -I DIR/modules/B -I DIR/modules/C -fmodules -fimplicit-module-maps -fmodules-cache-path=DIR/cache -include DIR/pch.h -o DIR/tu.o",
   "file": "DIR/tu.m"
 }
 ]
 
+//--- build/compile-commands-tu1.json.in
+
+[
+{
+  "directory": "DIR",
+  "command": "clang -fsyntax-only DIR/tu1.m -I DIR/modules/B -I DIR/modules/D -I DIR/modules/E -fmodules -fimplicit-module-maps -fmodules-cache-path=DIR/cache -include DIR/pch.h -o DIR/tu1.o -ivfsoverlay DIR/build/pch-overlay.yaml",
+  "file": "DIR/tu1.m"
+}
+]
+
 //--- build/pch-overlay.yaml.in
 
 {
@@ -95,7 +161,7 @@
 
 //--- build/A.h
 
-typedef int A_t;
+typedef int A_t __attribute__((deprec...
[truncated]

+            // VFS options. Fall back to a string comparison instead.
+            auto VFSMap = PrebuiltModuleVFSMap.find(MF->FileName);
+            if (VFSMap == PrebuiltModuleVFSMap.end())
+              return;
Contributor

This almost makes it seem like it's okay to import a non-implicit module that we didn't visit when dealing with the PCH. I think it would be good to add a sanity check here: keep an entry for each PCH dependency (mapping to an empty set if no VFS overlay files were used) and assert here that VFSMap != PrebuiltModuleVFSMap.end().

Contributor Author

I think it's also possible that you used an explicit module in the original driver command, or a prebuilt module path. The idea of silently ignoring unknown modules here was to preserve the existing behavior of ignoring them. I would eventually like to make this an error, but the current approach works in most cases, so I don't want to do that yet. It would probably be good to make it a warning in this patch, though; I don't expect this return to fire unless something weird is going on.

Contributor

I would still like this implemented in this PR, but it's not mandatory.
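
One way the two positions in this thread could be reconciled, sketched with standard containers: every visited file gets an entry (an empty set when it had no overlays), and a lookup miss is surfaced as a warning rather than silently ignored or asserted on. The helper `lookupPrebuiltOverlays` and the warning text are hypothetical and do not appear in the patch.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using OverlayMap = std::map<std::string, std::set<std::string>>;

// Hypothetical helper: returns the recorded overlays for a non-implicit
// module, or nullptr (after recording a warning) if the file was never
// visited while reading the PCH.
const std::set<std::string> *
lookupPrebuiltOverlays(const OverlayMap &Map, const std::string &FileName,
                       std::vector<std::string> &Warnings) {
  OverlayMap::const_iterator It = Map.find(FileName);
  if (It == Map.end()) {
    Warnings.push_back("no VFS overlay record for module file '" + FileName +
                       "'");
    return nullptr;
  }
  return &It->second;
}
```

With an entry present for every visited file, a null result can only mean the module came from somewhere the scanner did not look, such as an explicit module on the original command line.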

Contributor

@jansvoboda11 left a comment

Just to clarify, this patch doesn't attempt to solve the case where Clang can crash when the VFS overlay files are different between the PCH and the TU, since that's existing behavior. Correct?

@@ -67,7 +68,7 @@ static bool checkHeaderSearchPaths(const HeaderSearchOptions &HSOpts,
   if (LangOpts.Modules) {
     if (HSOpts.VFSOverlayFiles != ExistingHSOpts.VFSOverlayFiles) {
       if (Diags) {
-        Diags->Report(diag::err_pch_vfsoverlay_mismatch);
+        Diags->Report(diag::warn_pch_vfsoverlay_mismatch);
Contributor

Seems like this is applicable outside of the scanner. Should we move this into the generic PCH loading code?

Contributor Author

Maybe. This is currently disabled by default, but it is useful for normal PCH too. It would be an issue, though, if we ever want to start changing the original command lines as well (to increase cache hits for normal TUs): there you only want to warn when scanning, not in the actual build.
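
For readers following the thread, here is a rough model of the check after this change: a mismatch is reported as a warning with one note per side, and the function no longer rejects the PCH. This is only a sketch; it collects messages into a vector instead of using `DiagnosticsEngine`, omits the `LangOpts.Modules` guard, and the way the overlay list is joined inside the note is an assumption, since that part of the diff is not shown.

```cpp
#include <string>
#include <vector>

// Builds the note for one side; the newline-joined list format is assumed.
static std::string overlayNote(const std::string &Who,
                               const std::vector<std::string> &Overlays) {
  if (Overlays.empty())
    return "note: " + Who + " has no VFS overlays";
  std::string Text = "note: " + Who + " has the following VFS overlays:";
  for (const std::string &O : Overlays)
    Text += "\n" + O;
  return Text;
}

// Hypothetical model: returns true only if the PCH must be rejected, which
// this check never does now that the mismatch is a warning.
bool checkOverlayMismatch(const std::vector<std::string> &PCHOverlays,
                          const std::vector<std::string> &TUOverlays,
                          std::vector<std::string> *Diags) {
  if (PCHOverlays != TUOverlays && Diags) {
    Diags->push_back("warning: PCH was compiled with different VFS overlay "
                     "files than are currently in use");
    Diags->push_back(overlayNote("PCH", PCHOverlays));
    Diags->push_back(overlayNote("current translation unit", TUOverlays));
  }
  return false;
}
```

Whether the mismatch stops the scan is then decided by the diagnostic flags, as in the test's `-Wpch-vfs-diff -Werror=pch-vfs-diff` command line, rather than by the return value.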

@Bigcheese
Contributor Author

> Just to clarify, this patch doesn't attempt to solve the case where Clang can crash when the VFS overlay files are different between the PCH and the TU, since that's existing behavior. Correct?

Yep, this patch still allows that to happen in cases where it would today.

Contributor

@jansvoboda11 left a comment

LGTM

It turns out it's not that uncommon for real code to pass a different
set of VFSs while building a PCH than while using the PCH. This can
cause problems as seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If
you scan `compile-commands-tu-no-vfs-error.json` without -Werror and
run the resulting commands, Clang will emit a fatal error while trying
to emit a note saying that it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are
part of an included PCH, as the same issue can occur in a module we
are building if we drop VFSs. This has to be textual because we have
no guarantee the PCH had the same list of VFSs as the current TU.
@Bigcheese
Contributor Author

CI failure is a preexisting Flang test failure and a preexisting trailing whitespace issue.

@Bigcheese Bigcheese merged commit de3b2c2 into llvm:main Feb 24, 2024
3 of 4 checks passed
+  //       warnings that come from source code. The issue is that we need to
+  //       ignore warnings that could be surpressed by
+  //       `#pragma clang diagnostic`, while still allowing some scanning
+  //       warnings for things we're not ready to turn into errors yet.
Collaborator

I'm concerned about removing IgnoreWarnings = true. There are lots of default warnings we may not want. Also, this would allow #pragma clang diagnostic to promote arbitrary diagnostics to errors.

Contributor Author

The scanner never sees #pragma clang diagnostic, so there's no issue with code that uses that to turn warnings on. The original issue was with warnings getting turned off via #pragma clang diagnostic, but the new code removes all warnings and Werrors, so you're just left with default warnings.

The goal here was to keep driver warnings (which are lost otherwise) and allow us to have dedicated scanner warnings. I do think we want more control over this, possibly add a scanner bit to diagnostics so we can be explicit about which warnings we expect from the scanner, but I think this change is fine for now.
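
The filtering described here can be modeled without LLVM as follows. This is a sketch of the same keep/drop rule the patch expresses with `llvm::erase_if` and `llvm::StringSwitch`, assuming the entries are warning options with the leading `-W` already stripped; the function name `keepOnlyScannerWarnings` is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

static bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Drop every user-supplied warning option except the scanner-specific
// ones and any "no-error=" demotions.
void keepOnlyScannerWarnings(std::vector<std::string> &Warnings) {
  Warnings.erase(
      std::remove_if(Warnings.begin(), Warnings.end(),
                     [](const std::string &W) {
                       bool Keep = W == "pch-vfs-diff" ||
                                   W == "error=pch-vfs-diff" ||
                                   startsWith(W, "no-error=");
                       return !Keep;
                     }),
      Warnings.end());
}
```

After filtering, only default-enabled warnings plus the explicitly kept scanner warning can fire, which is the "just left with default warnings" state discussed in this thread.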

Collaborator

> The scanner never sees #pragma clang diagnostic, so there's no issue with code that uses that to turn warnings on.

Ah sorry, I forgot we skipped over most pragmas.

> so you're just left with default warnings.
>
> The goal here was to keep driver warnings (which are lost otherwise) and allow us to have dedicated scanner warnings. I do think we want more control over this, possibly add a scanner bit to diagnostics so we can be explicit about which warnings we expect from the scanner, but I think this change is fine for now.

This goal makes some sense to me, but I'm not sure that using the default warnings is a good idea. The default warnings can just as easily cause us to emit a driver warning that the user was explicitly trying to hide.

@Bigcheese Bigcheese deleted the dev/vfs-pch-fix branch March 5, 2024 22:46