Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] Add -fhermetic-module-files #98083

Merged
merged 1 commit into from
Jul 11, 2024
Merged

[flang] Add -fhermetic-module-files #98083

merged 1 commit into from
Jul 11, 2024

Conversation

klausler
Copy link
Contributor

@klausler klausler commented Jul 8, 2024

Module files emitted by this Fortran compiler are valid Fortran source files. Symbols that are USE-associated into modules are represented in their module files with USE statements and special comments with hash codes in them to ensure that those USE statements resolve to the same modules that were used to build the module when its module file was generated.

This scheme prevents unchecked module file growth in large applications by not emitting USE-associated symbols redundantly. This problem can be especially bad when derived type definitions must be repeated in the module files of their clients, and the clients of those modules, and so on. However, this scheme has the disadvantage that clients of modules must be compiled with dependent modules in the module search path.

This new -fhermetic-module-files option causes module file output to be free of dependences on any non-intrinsic module files; dependent modules are instead emitted as part of the module file, rather than being USE-associated. It is intended for top level library module files that are shipped with binary libraries when it is not convenient to collect and ship their dependent module files as well.

Fixes #97398.

Module files emitted by this Fortran compiler are valid Fortran source
files.  Symbols that are USE-associated into modules are represented
in their module files with USE statements and special comments with
hash codes in them to ensure that those USE statements resolve to
the same modules that were used to build the module when its module
file was generated.

This scheme prevents unchecked module file growth in large applications
by not emitting USE-associated symbols redundantly.  This problem can
be especially bad when derived type definitions must be repeated in
the module files of their clients, and the clients of those modules,
and so on.  However, this scheme has the disadvantage that clients of
modules must be compiled with dependent modules in the module search path.

This new -fhermetic-module-files option causes module file output to
be free of dependences on any non-intrinsic module files; dependent
modules are instead emitted as part of the module file, rather than
being USE-associated.  It is intended for top level library module files
that are shipped with binary libraries when it is not convenient to collect
and ship their dependent module files as well.

Fixes llvm#97398.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' flang:driver flang Flang issues not falling into any other category flang:semantics labels Jul 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 8, 2024

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-flang-semantics
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-flang-driver

Author: Peter Klausler (klausler)

Changes

Module files emitted by this Fortran compiler are valid Fortran source files. Symbols that are USE-associated into modules are represented in their module files with USE statements and special comments with hash codes in them to ensure that those USE statements resolve to the same modules that were used to build the module when its module file was generated.

This scheme prevents unchecked module file growth in large applications by not emitting USE-associated symbols redundantly. This problem can be especially bad when derived type definitions must be repeated in the module files of their clients, and the clients of those modules, and so on. However, this scheme has the disadvantage that clients of modules must be compiled with dependent modules in the module search path.

This new -fhermetic-module-files option causes module file output to be free of dependences on any non-intrinsic module files; dependent modules are instead emitted as part of the module file, rather than being USE-associated. It is intended for top level library module files that are shipped with binary libraries when it is not convenient to collect and ship their dependent module files as well.

Fixes #97398.


Full diff: https://github.com/llvm/llvm-project/pull/98083.diff

10 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+20-12)
  • (modified) flang/include/flang/Frontend/CompilerInvocation.h (+9)
  • (modified) flang/include/flang/Semantics/semantics.h (+6-4)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+5)
  • (modified) flang/lib/Frontend/FrontendAction.cpp (+5-2)
  • (modified) flang/lib/Semantics/mod-file.cpp (+32-10)
  • (modified) flang/lib/Semantics/mod-file.h (+6-1)
  • (modified) flang/lib/Semantics/semantics.cpp (+3-1)
  • (added) flang/test/Semantics/modfile65.f90 (+53)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 1c2b8cfeef6ce..f3488c27ca153 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6667,6 +6667,9 @@ defm stack_arrays : BoolOptionWithoutMarshalling<"f", "stack-arrays",
 defm loop_versioning : BoolOptionWithoutMarshalling<"f", "version-loops-for-stride",
   PosFlag<SetTrue, [], [ClangOption], "Create unit-strided versions of loops">,
    NegFlag<SetFalse, [], [ClangOption], "Do not create unit-strided loops (default)">>;
+
+def fhermetic_module_files : Flag<["-"], "fhermetic-module-files">, Group<f_Group>,
+  HelpText<"Emit hermetic module files (no nested USE association)">;
 } // let Visibility = [FC1Option, FlangOption]
 
 def J : JoinedOrSeparate<["-"], "J">,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index 42b45dba2bd31..f9d4704dbb7bd 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -35,18 +35,26 @@ static void addDashXForInput(const ArgList &Args, const InputInfo &Input,
 
 void Flang::addFortranDialectOptions(const ArgList &Args,
                                      ArgStringList &CmdArgs) const {
-  Args.addAllArgs(
-      CmdArgs, {options::OPT_ffixed_form, options::OPT_ffree_form,
-                options::OPT_ffixed_line_length_EQ, options::OPT_fopenacc,
-                options::OPT_finput_charset_EQ, options::OPT_fimplicit_none,
-                options::OPT_fno_implicit_none, options::OPT_fbackslash,
-                options::OPT_fno_backslash, options::OPT_flogical_abbreviations,
-                options::OPT_fno_logical_abbreviations,
-                options::OPT_fxor_operator, options::OPT_fno_xor_operator,
-                options::OPT_falternative_parameter_statement,
-                options::OPT_fdefault_real_8, options::OPT_fdefault_integer_8,
-                options::OPT_fdefault_double_8, options::OPT_flarge_sizes,
-                options::OPT_fno_automatic});
+  Args.addAllArgs(CmdArgs, {options::OPT_ffixed_form,
+                            options::OPT_ffree_form,
+                            options::OPT_ffixed_line_length_EQ,
+                            options::OPT_fopenacc,
+                            options::OPT_finput_charset_EQ,
+                            options::OPT_fimplicit_none,
+                            options::OPT_fno_implicit_none,
+                            options::OPT_fbackslash,
+                            options::OPT_fno_backslash,
+                            options::OPT_flogical_abbreviations,
+                            options::OPT_fno_logical_abbreviations,
+                            options::OPT_fxor_operator,
+                            options::OPT_fno_xor_operator,
+                            options::OPT_falternative_parameter_statement,
+                            options::OPT_fdefault_real_8,
+                            options::OPT_fdefault_integer_8,
+                            options::OPT_fdefault_double_8,
+                            options::OPT_flarge_sizes,
+                            options::OPT_fno_automatic,
+                            options::OPT_fhermetic_module_files});
 }
 
 void Flang::addPreprocessingOptions(const ArgList &Args,
diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h
index 0fefaecfe4f06..d1646f585cf85 100644
--- a/flang/include/flang/Frontend/CompilerInvocation.h
+++ b/flang/include/flang/Frontend/CompilerInvocation.h
@@ -99,6 +99,7 @@ class CompilerInvocation : public CompilerInvocationBase {
   std::string moduleFileSuffix = ".mod";
 
   bool debugModuleDir = false;
+  bool hermeticModuleFileOutput = false;
 
   bool warnAsErr = false;
 
@@ -179,6 +180,11 @@ class CompilerInvocation : public CompilerInvocationBase {
   bool &getDebugModuleDir() { return debugModuleDir; }
   const bool &getDebugModuleDir() const { return debugModuleDir; }
 
+  bool &getHermeticModuleFileOutput() { return hermeticModuleFileOutput; }
+  const bool &getHermeticModuleFileOutput() const {
+    return hermeticModuleFileOutput;
+  }
+
   bool &getWarnAsErr() { return warnAsErr; }
   const bool &getWarnAsErr() const { return warnAsErr; }
 
@@ -244,6 +250,9 @@ class CompilerInvocation : public CompilerInvocationBase {
   }
 
   void setDebugModuleDir(bool flag) { debugModuleDir = flag; }
+  void setHermeticModuleFileOutput(bool flag) {
+    hermeticModuleFileOutput = flag;
+  }
 
   void setWarnAsErr(bool flag) { warnAsErr = flag; }
 
diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h
index d382663762bc3..3ee71fe485524 100644
--- a/flang/include/flang/Semantics/semantics.h
+++ b/flang/include/flang/Semantics/semantics.h
@@ -307,10 +307,11 @@ class SemanticsContext {
 
 class Semantics {
 public:
-  explicit Semantics(SemanticsContext &context, parser::Program &program,
-      bool debugModuleWriter = false)
-      : context_{context}, program_{program} {
-    context.set_debugModuleWriter(debugModuleWriter);
+  explicit Semantics(SemanticsContext &context, parser::Program &program)
+      : context_{context}, program_{program} {}
+  Semantics &set_hermeticModuleFileOutput(bool yes = true) {
+    hermeticModuleFileOutput_ = yes;
+    return *this;
   }
 
   SemanticsContext &context() const { return context_; }
@@ -326,6 +327,7 @@ class Semantics {
 private:
   SemanticsContext &context_;
   parser::Program &program_;
+  bool hermeticModuleFileOutput_{false};
 };
 
 // Base class for semantics checkers.
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index f96d72f1ad691..8a3a9f895aca6 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -809,6 +809,11 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args,
     res.setDebugModuleDir(true);
   }
 
+  // -fhermetic-module-files option
+  if (args.hasArg(clang::driver::options::OPT_fhermetic_module_files)) {
+    res.setHermeticModuleFileOutput(true);
+  }
+
   // -module-suffix
   if (const auto *moduleSuffix =
           args.getLastArg(clang::driver::options::OPT_module_suffix)) {
diff --git a/flang/lib/Frontend/FrontendAction.cpp b/flang/lib/Frontend/FrontendAction.cpp
index 765e203ddfbe5..42a614fe46be5 100644
--- a/flang/lib/Frontend/FrontendAction.cpp
+++ b/flang/lib/Frontend/FrontendAction.cpp
@@ -180,11 +180,14 @@ bool FrontendAction::runSemanticChecks() {
   // so that they are merged and all printed in order.
   auto &semanticsCtx{ci.getSemanticsContext()};
   semanticsCtx.messages().Annex(std::move(ci.getParsing().messages()));
+  semanticsCtx.set_debugModuleWriter(ci.getInvocation().getDebugModuleDir());
 
   // Prepare semantics
-  ci.setSemantics(std::make_unique<Fortran::semantics::Semantics>(
-      semanticsCtx, *parseTree, ci.getInvocation().getDebugModuleDir()));
+  ci.setSemantics(std::make_unique<Fortran::semantics::Semantics>(semanticsCtx,
+                                                                  *parseTree));
   auto &semantics = ci.getSemantics();
+  semantics.set_hermeticModuleFileOutput(
+      ci.getInvocation().getHermeticModuleFileOutput());
 
   // Run semantic checks
   semantics.Perform();
diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp
index d7f149467dd7a..a1c4c031ca509 100644
--- a/flang/lib/Semantics/mod-file.cpp
+++ b/flang/lib/Semantics/mod-file.cpp
@@ -141,10 +141,26 @@ void ModFileWriter::Write(const Symbol &symbol) {
   auto ancestorName{ancestor ? ancestor->GetName().value().ToString() : ""s};
   auto path{context_.moduleDirectory() + '/' +
       ModFileName(symbol.name(), ancestorName, context_.moduleFileSuffix())};
-  PutSymbols(DEREF(symbol.scope()));
+
+  UnorderedSymbolSet hermeticModules;
+  hermeticModules.insert(symbol);
+  UnorderedSymbolSet additionalModules;
+  PutSymbols(DEREF(symbol.scope()),
+      hermeticModuleFileOutput_ ? &additionalModules : nullptr);
+  auto asStr{GetAsString(symbol)};
+  while (!additionalModules.empty()) {
+    for (auto ref : UnorderedSymbolSet{std::move(additionalModules)}) {
+      if (hermeticModules.insert(*ref).second &&
+          !ref->owner().IsIntrinsicModules()) {
+        PutSymbols(DEREF(ref->scope()), &additionalModules);
+        asStr += GetAsString(*ref);
+      }
+    }
+  }
+
   ModuleCheckSumType checkSum;
-  if (std::error_code error{WriteFile(
-          path, GetAsString(symbol), checkSum, context_.debugModuleWriter())}) {
+  if (std::error_code error{
+          WriteFile(path, asStr, checkSum, context_.debugModuleWriter())}) {
     context_.Say(
         symbol.name(), "Error writing %s: %s"_err_en_US, path, error.message());
   }
@@ -157,7 +173,7 @@ void ModFileWriter::WriteClosure(llvm::raw_ostream &out, const Symbol &symbol,
       !nonIntrinsicModulesWritten.insert(symbol).second) {
     return;
   }
-  PutSymbols(DEREF(symbol.scope()));
+  PutSymbols(DEREF(symbol.scope()), /*hermeticModules=*/nullptr);
   needsBuf_.clear(); // omit module checksums
   auto str{GetAsString(symbol)};
   for (auto depRef : std::move(usedNonIntrinsicModules_)) {
@@ -338,7 +354,8 @@ void ModFileWriter::PrepareRenamings(const Scope &scope) {
 }
 
 // Put out the visible symbols from scope.
-void ModFileWriter::PutSymbols(const Scope &scope) {
+void ModFileWriter::PutSymbols(
+    const Scope &scope, UnorderedSymbolSet *hermeticModules) {
   SymbolVector sorted;
   SymbolVector uses;
   auto &renamings{context_.moduleFileOutputRenamings()};
@@ -349,11 +366,16 @@ void ModFileWriter::PutSymbols(const Scope &scope) {
   // Write module files for dependencies first so that their
   // hashes are known.
   for (auto ref : modules) {
-    Write(*ref);
-    needs_ << ModHeader::need
-           << CheckSumString(ref->get<ModuleDetails>().moduleFileHash().value())
-           << (ref->owner().IsIntrinsicModules() ? " i " : " n ")
-           << ref->name().ToString() << '\n';
+    if (hermeticModules) {
+      hermeticModules->insert(*ref);
+    } else {
+      Write(*ref);
+      needs_ << ModHeader::need
+             << CheckSumString(
+                    ref->get<ModuleDetails>().moduleFileHash().value())
+             << (ref->owner().IsIntrinsicModules() ? " i " : " n ")
+             << ref->name().ToString() << '\n';
+    }
   }
   std::string buf; // stuff after CONTAINS in derived type
   llvm::raw_string_ostream typeBindings{buf};
diff --git a/flang/lib/Semantics/mod-file.h b/flang/lib/Semantics/mod-file.h
index be44780bef438..49bcef0c923fb 100644
--- a/flang/lib/Semantics/mod-file.h
+++ b/flang/lib/Semantics/mod-file.h
@@ -37,6 +37,10 @@ class ModFileWriter {
   bool WriteAll();
   void WriteClosure(llvm::raw_ostream &, const Symbol &,
       UnorderedSymbolSet &nonIntrinsicModulesWritten);
+  ModFileWriter &set_hermeticModuleFileOutput(bool yes = true) {
+    hermeticModuleFileOutput_ = yes;
+    return *this;
+  }
 
 private:
   SemanticsContext &context_;
@@ -57,13 +61,14 @@ class ModFileWriter {
   llvm::raw_string_ostream decls_{declsBuf_};
   llvm::raw_string_ostream contains_{containsBuf_};
   bool isSubmodule_{false};
+  bool hermeticModuleFileOutput_{false};
 
   void WriteAll(const Scope &);
   void WriteOne(const Scope &);
   void Write(const Symbol &);
   std::string GetAsString(const Symbol &);
   void PrepareRenamings(const Scope &);
-  void PutSymbols(const Scope &);
+  void PutSymbols(const Scope &, UnorderedSymbolSet *hermetic);
   // Returns true if a derived type with bindings and "contains" was emitted
   bool PutComponents(const Symbol &);
   void PutSymbol(llvm::raw_ostream &, const Symbol &);
diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp
index 1bb0679b75110..c09734e5676f3 100644
--- a/flang/lib/Semantics/semantics.cpp
+++ b/flang/lib/Semantics/semantics.cpp
@@ -602,7 +602,9 @@ bool Semantics::Perform() {
       CanonicalizeCUDA(program_) &&
       CanonicalizeDirectives(context_.messages(), program_) &&
       PerformStatementSemantics(context_, program_) &&
-      ModFileWriter{context_}.WriteAll();
+      ModFileWriter{context_}
+          .set_hermeticModuleFileOutput(hermeticModuleFileOutput_)
+          .WriteAll();
 }
 
 void Semantics::EmitMessages(llvm::raw_ostream &os) {
diff --git a/flang/test/Semantics/modfile65.f90 b/flang/test/Semantics/modfile65.f90
new file mode 100644
index 0000000000000..249255e02129f
--- /dev/null
+++ b/flang/test/Semantics/modfile65.f90
@@ -0,0 +1,53 @@
+! RUN: %python %S/test_modfile.py %s %flang_fc1 -fhermetic-module-files
+module m1
+  integer, parameter :: n = 123
+end
+
+module m2
+  use m1
+end
+
+module m3
+  use m1, m => n
+end
+
+module m4
+  use m2
+  use m3
+end
+
+!Expect: m1.mod
+!module m1
+!integer(4),parameter::n=123_4
+!end
+
+!Expect: m2.mod
+!module m2
+!use m1,only:n
+!end
+!module m1
+!integer(4),parameter::n=123_4
+!end
+
+!Expect: m3.mod
+!module m3
+!use m1,only:m=>n
+!end
+!module m1
+!integer(4),parameter::n=123_4
+!end
+
+!Expect: m4.mod
+!module m4
+!use m2,only:n
+!use m3,only:m
+!end
+!module m2
+!use m1,only:n
+!end
+!module m3
+!use m1,only:m=>n
+!end
+!module m1
+!integer(4),parameter::n=123_4
+!end

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I guess in most use case only one module file emitted in the compilation unit is probably meant to be delivered (while this patch will emit all module file of this compilation unit as hermetic), but most app build only define one module per compilation unit anyway, so they can set the new -fhermetic-module option only for the module file being delivered to avoid heavier builds of every intermediate internal module. All that to say, I like the simplicity of your solution.

Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The driver logic makes sense, thank you! LGTM

@klausler klausler merged commit 6598795 into llvm:main Jul 11, 2024
13 checks passed
@klausler klausler deleted the modfiles branch July 11, 2024 21:02
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
Module files emitted by this Fortran compiler are valid Fortran source
files. Symbols that are USE-associated into modules are represented in
their module files with USE statements and special comments with hash
codes in them to ensure that those USE statements resolve to the same
modules that were used to build the module when its module file was
generated.

This scheme prevents unchecked module file growth in large applications
by not emitting USE-associated symbols redundantly. This problem can be
especially bad when derived type definitions must be repeated in the
module files of their clients, and the clients of those modules, and so
on. However, this scheme has the disadvantage that clients of modules
must be compiled with dependent modules in the module search path.

This new -fhermetic-module-files option causes module file output to be
free of dependences on any non-intrinsic module files; dependent modules
are instead emitted as part of the module file, rather than being
USE-associated. It is intended for top level library module files that
are shipped with binary libraries when it is not convenient to collect
and ship their dependent module files as well.

Fixes llvm#97398.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category flang:driver flang:semantics flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flang-new trying to access the imported module inside the .mod file
4 participants