[MLIR] emitc: Add emitc.file op #123298

mgehre-amd · 2025-01-17T07:57:24Z

An emitc.file op represents a file that can be emitted
into a single C++ file.

This allows managing multiple source files within the same MLIR module
while emitting them into separate files.

This feature is opt-in.
By default, mlir-translate emits all ops outside of emitc.file
and ignores all emitc.file ops and their bodies.

When specifying the -file-id=id flag,
mlir-translate emits all ops outside of emitc.file and
the ops within the emitc.file with matching id.

Example:

emitc.file "main" {
  func @func_one() {
    return
  }
}
emitc.file "test" {
  func @func_two() {
    return
  }
}

mlir-translate -file-id=main will emit func_one and
mlir-translate -file-id=test will emit func_two.
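
The selection rule described above can be modeled in a few lines of standalone C++. This is only a sketch: `TopLevelOp` and `selectForEmission` are hypothetical names made up for illustration and are not part of MLIR or of this patch. It assumes nothing beyond the stated behavior, namely that ops outside any emitc.file are always emitted and ops inside one are emitted only when the requested id matches.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a top-level op; this is not an MLIR type.
struct TopLevelOp {
  std::string name;
  // Id of the enclosing emitc.file, or std::nullopt if the op is outside
  // of any emitc.file.
  std::optional<std::string> fileId;
};

// Returns the names of the ops that would be emitted.
// `requestedId` models the -file-id flag; std::nullopt means "flag not given".
std::vector<std::string>
selectForEmission(const std::vector<TopLevelOp> &ops,
                  const std::optional<std::string> &requestedId) {
  std::vector<std::string> emitted;
  for (const TopLevelOp &op : ops) {
    bool outsideFile = !op.fileId.has_value();
    bool matches = requestedId.has_value() && op.fileId == requestedId;
    if (outsideFile || matches)
      emitted.push_back(op.name);
  }
  return emitted;
}
```

With the two files from the example plus one op outside of them, requesting "main" would select the outside op and func_one, while giving no id would select only the outside op.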

llvmbot · 2025-01-17T07:57:56Z

@llvm/pr-subscribers-mlir-emitc

Author: Matthias Gehre (mgehre-amd)

Changes

An emitc.tu represents a translation unit that can be emitted into a single C++ file.

This allows managing multiple translation units within the same MLIR module while emitting them into separate files.
This feature is opt-in; by default,
mlir-translate will continue to emit the whole module.

When specifying the -translation-unit-id=id flag, mlir-translate emits only the translation unit with that id.

Example:

emitc.tu "main" {
  func @func_one() {
    return
  }
}
emitc.tu "test" {
  func @func_two() {
    return
  }
}

mlir-translate -translation-unit-id=main will emit func_one and mlir-translate -translation-unit-id=test will emit func_two.

Full diff: https://github.com/llvm/llvm-project/pull/123298.diff

6 Files Affected:

(modified) mlir/include/mlir/Dialect/EmitC/IR/EmitC.td (+49)
(modified) mlir/include/mlir/Target/Cpp/CppEmitter.h (+3-1)
(modified) mlir/lib/Dialect/EmitC/IR/EmitC.cpp (+10)
(modified) mlir/lib/Target/Cpp/TranslateRegistration.cpp (+7-1)
(modified) mlir/lib/Target/Cpp/TranslateToCpp.cpp (+31-7)
(added) mlir/test/Target/Cpp/tu.mlir (+29)

diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
index b16f5a8619fe7b..1fe4e34b3fa5ab 100644
--- a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
+++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
@@ -23,6 +23,7 @@ include "mlir/Interfaces/FunctionInterfaces.td"
 include "mlir/Interfaces/SideEffectInterfaces.td"
 include "mlir/IR/OpAsmInterface.td"
 include "mlir/IR/RegionKindInterface.td"
+include "mlir/IR/BuiltinAttributes.td"
 
 //===----------------------------------------------------------------------===//
 // EmitC op definitions
@@ -56,6 +57,54 @@ def IntegerIndexOrOpaqueType : Type<CPred<"emitc::isIntegerIndexOrOpaqueType($_s
 "integer, index or opaque type supported by EmitC">;
 def FloatIntegerIndexOrOpaqueType : AnyTypeOf<[EmitCFloatType, IntegerIndexOrOpaqueType]>;
 
+def EmitC_TranslationUnitOp
+    : EmitC_Op<"tu", [IsolatedFromAbove, NoRegionArguments, SymbolTable,
+                      OpAsmOpInterface]#GraphRegionNoTerminator.traits> {
+  let summary = "A translation unit container operation";
+  let description = [{
+    A `tu` represents a translation unit that can be emitted
+    into a single C++ file.
+
+    `mlir-translate` emits only the translation unit selected via
+    the `-translation-unit-id=id` flag. By default, no translation units are
+    emitted.
+
+    Example:
+
+    ```mlir
+    emitc.tu "main" {
+      emitc.func @func_one() {
+        emitc.return
+      }
+    }
+    ```
+  }];
+
+  let arguments = (ins Builtin_StringAttr:$id);
+  let regions = (region SizedRegion<1>:$bodyRegion);
+
+  let assemblyFormat = "$id attr-dict-with-keyword $bodyRegion";
+  let builders = [OpBuilder<(ins CArg<"StringRef">:$id)>];
+  let extraClassDeclaration = [{
+    /// Construct a module from the given location with an optional name.
+    static TranslationUnitOp create(Location loc, StringRef name);
+
+    //===------------------------------------------------------------------===//
+    // OpAsmOpInterface Methods
+    //===------------------------------------------------------------------===//
+
+    /// EmitC ops in the body of the translation_unit can omit their 'emitc.'
+    /// prefix in the assembly.
+    static ::llvm::StringRef getDefaultDialect() {
+      return "emitc";
+    }
+  }];
+
+  // We need to ensure that the body region has a block;
+  // the auto-generated builders do not guarantee that.
+  let skipDefaultBuilders = 1;
+}
+
 def EmitC_AddOp : EmitC_BinaryOp<"add", [CExpression]> {
   let summary = "Addition operation";
   let description = [{
diff --git a/mlir/include/mlir/Target/Cpp/CppEmitter.h b/mlir/include/mlir/Target/Cpp/CppEmitter.h
index 99d8696cc8e077..d76cfc9107332e 100644
--- a/mlir/include/mlir/Target/Cpp/CppEmitter.h
+++ b/mlir/include/mlir/Target/Cpp/CppEmitter.h
@@ -14,6 +14,7 @@
 #define MLIR_TARGET_CPP_CPPEMITTER_H
 
 #include "mlir/Support/LLVM.h"
+#include "llvm/ADT/StringRef.h"
 
 namespace mlir {
 class Operation;
@@ -24,7 +25,8 @@ namespace emitc {
 /// 'declareVariablesAtTop' enforces that all variables for op results and block
 /// arguments are declared at the beginning of the function.
 LogicalResult translateToCpp(Operation *op, raw_ostream &os,
-                             bool declareVariablesAtTop = false);
+                             bool declareVariablesAtTop = false,
+                             StringRef onlyTu = "");
 } // namespace emitc
 } // namespace mlir
 
diff --git a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
index fdc21d6c6e24b9..b51221b721dde3 100644
--- a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+++ b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
@@ -1289,6 +1289,16 @@ void SwitchOp::getRegionInvocationBounds(
     bounds.emplace_back(/*lb=*/0, /*ub=*/regIndex == liveIndex);
 }
 
+//===----------------------------------------------------------------------===//
+// TranslationUnitOp
+//===----------------------------------------------------------------------===//
+void TranslationUnitOp::build(OpBuilder &builder, OperationState &state,
+                              StringRef id) {
+  state.addRegion()->emplaceBlock();
+  state.attributes.push_back(
+      builder.getNamedAttr("id", builder.getStringAttr(id)));
+}
+
 //===----------------------------------------------------------------------===//
 // TableGen'd op method definitions
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Target/Cpp/TranslateRegistration.cpp b/mlir/lib/Target/Cpp/TranslateRegistration.cpp
index 1aa98834a73f49..7e2bc9ad012b38 100644
--- a/mlir/lib/Target/Cpp/TranslateRegistration.cpp
+++ b/mlir/lib/Target/Cpp/TranslateRegistration.cpp
@@ -29,12 +29,18 @@ void registerToCppTranslation() {
       llvm::cl::desc("Declare variables at top when emitting C/C++"),
       llvm::cl::init(false));
 
+  static llvm::cl::opt<std::string> onlyTu(
+      "translation-unit-id",
+      llvm::cl::desc("Only emit the translation unit with the matching id"),
+      llvm::cl::init(""));
+
   TranslateFromMLIRRegistration reg(
       "mlir-to-cpp", "translate from mlir to cpp",
       [](Operation *op, raw_ostream &output) {
         return emitc::translateToCpp(
             op, output,
-            /*declareVariablesAtTop=*/declareVariablesAtTop);
+            /*declareVariablesAtTop=*/declareVariablesAtTop,
+            /*onlyTu=*/onlyTu);
       },
       [](DialectRegistry &registry) {
         // clang-format off
diff --git a/mlir/lib/Target/Cpp/TranslateToCpp.cpp b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
index a91f5ab9311401..c9d66ad349db52 100644
--- a/mlir/lib/Target/Cpp/TranslateToCpp.cpp
+++ b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
@@ -114,7 +114,8 @@ static FailureOr<int> getOperatorPrecedence(Operation *operation) {
 namespace {
 /// Emitter that uses dialect specific emitters to emit C++ code.
 struct CppEmitter {
-  explicit CppEmitter(raw_ostream &os, bool declareVariablesAtTop);
+  explicit CppEmitter(raw_ostream &os, bool declareVariablesAtTop,
+                      StringRef onlyTu);
 
   /// Emits attribute or returns failure.
   LogicalResult emitAttribute(Location loc, Attribute attr);
@@ -231,6 +232,9 @@ struct CppEmitter {
   /// be declared at the beginning of a function.
   bool shouldDeclareVariablesAtTop() { return declareVariablesAtTop; };
 
+  /// Returns whether this translation unit should be emitted
+  bool shouldEmitTu(TranslationUnitOp tu) { return tu.getId() == onlyTu; }
+
   /// Get expression currently being emitted.
   ExpressionOp getEmittedExpression() { return emittedExpression; }
 
@@ -258,6 +262,9 @@ struct CppEmitter {
   /// includes results from ops located in nested regions.
   bool declareVariablesAtTop;
 
+  /// Only emit translation units whos id matches this value.
+  std::string onlyTu;
+
   /// Map from value to name of C++ variable that contain the name.
   ValueMapper valueMapper;
 
@@ -960,6 +967,19 @@ static LogicalResult printOperation(CppEmitter &emitter, ModuleOp moduleOp) {
   return success();
 }
 
+static LogicalResult printOperation(CppEmitter &emitter, TranslationUnitOp tu) {
+  if (!emitter.shouldEmitTu(tu))
+    return success();
+
+  CppEmitter::Scope scope(emitter);
+
+  for (Operation &op : tu) {
+    if (failed(emitter.emitOperation(op, /*trailingSemicolon=*/false)))
+      return failure();
+  }
+  return success();
+}
+
 static LogicalResult printFunctionArgs(CppEmitter &emitter,
                                        Operation *functionOp,
                                        ArrayRef<Type> arguments) {
@@ -1159,8 +1179,10 @@ static LogicalResult printOperation(CppEmitter &emitter,
   return success();
 }
 
-CppEmitter::CppEmitter(raw_ostream &os, bool declareVariablesAtTop)
-    : os(os), declareVariablesAtTop(declareVariablesAtTop) {
+CppEmitter::CppEmitter(raw_ostream &os, bool declareVariablesAtTop,
+                       StringRef onlyTu)
+    : os(os), declareVariablesAtTop(declareVariablesAtTop),
+      onlyTu(onlyTu.str()) {
   valueInScopeCount.push(0);
   labelInScopeCount.push(0);
 }
@@ -1561,8 +1583,9 @@ LogicalResult CppEmitter::emitOperation(Operation &op, bool trailingSemicolon) {
                 emitc::GlobalOp, emitc::IfOp, emitc::IncludeOp, emitc::LoadOp,
                 emitc::LogicalAndOp, emitc::LogicalNotOp, emitc::LogicalOrOp,
                 emitc::MulOp, emitc::RemOp, emitc::ReturnOp, emitc::SubOp,
-                emitc::SwitchOp, emitc::UnaryMinusOp, emitc::UnaryPlusOp,
-                emitc::VariableOp, emitc::VerbatimOp>(
+                emitc::SwitchOp, emitc::TranslationUnitOp, emitc::UnaryMinusOp,
+                emitc::UnaryPlusOp, emitc::VariableOp, emitc::VerbatimOp>(
+
               [&](auto op) { return printOperation(*this, op); })
           // Func ops.
           .Case<func::CallOp, func::FuncOp, func::ReturnOp>(
@@ -1742,7 +1765,8 @@ LogicalResult CppEmitter::emitTupleType(Location loc, ArrayRef<Type> types) {
 }
 
 LogicalResult emitc::translateToCpp(Operation *op, raw_ostream &os,
-                                    bool declareVariablesAtTop) {
-  CppEmitter emitter(os, declareVariablesAtTop);
+                                    bool declareVariablesAtTop,
+                                    StringRef onlyTu) {
+  CppEmitter emitter(os, declareVariablesAtTop, onlyTu);
   return emitter.emitOperation(*op, /*trailingSemicolon=*/false);
 }
diff --git a/mlir/test/Target/Cpp/tu.mlir b/mlir/test/Target/Cpp/tu.mlir
new file mode 100644
index 00000000000000..ca10e0263a64fc
--- /dev/null
+++ b/mlir/test/Target/Cpp/tu.mlir
@@ -0,0 +1,29 @@
+// RUN: mlir-translate -mlir-to-cpp %s | FileCheck %s --check-prefix NO-FILTER
+// RUN: mlir-translate -mlir-to-cpp -translation-unit-id=non-existing %s | FileCheck %s --check-prefix NON-EXISTING
+// RUN: mlir-translate -mlir-to-cpp -translation-unit-id=tu_one %s | FileCheck %s --check-prefix TU-ONE
+// RUN: mlir-translate -mlir-to-cpp -translation-unit-id=tu_two %s | FileCheck %s --check-prefix TU-TWO
+
+
+// NO-FILTER-NOT: func_one
+// NO-FILTER-NOT: func_two
+
+// NON-EXISTING-NOT: func_one
+// NON-EXISTING-NOT: func_two
+
+// TU-ONE: func_one
+// TU-ONE-NOT: func_two
+
+// TU-TWO-NOT: func_one
+// TU-TWO: func_two
+
+emitc.tu "tu_one" {
+  emitc.func @func_one(%arg: f32) {
+    emitc.return
+  }
+}
+
+emitc.tu "tu_two" {
+  emitc.func @func_two(%arg: f32) {
+    emitc.return
+  }
+}

llvmbot · 2025-01-17T07:57:56Z

@llvm/pr-subscribers-mlir

simon-camp

LGTM

mlir/include/mlir/Dialect/EmitC/IR/EmitC.td

simon-camp · 2025-01-17T09:40:08Z

This feature is opt-in; by default,
mlir-translate will continue to emit the whole module.

Can you reformat this line, please? Also, maybe I'm reading this wrong, but I expected the code to emit all TUs if not filtered by the translation-unit-id flag. Should we reformulate to something like "mlir-translate will continue to emit operations defined outside of tu operations"?
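
For reference, the predicate in this revision of the diff is a plain string comparison against a flag whose default is the empty string. The standalone model below is a simplification: the free function takes plain strings instead of a `TranslationUnitOp`, and the default argument stands in for `llvm::cl::init("")`. Under those assumptions it suggests that named units are skipped when the flag is not given, which agrees with the NO-FILTER checks in tu.mlir, and that a unit whose id is the empty string would match the default, a case the test does not cover.

```cpp
#include <string>

// Simplified model of CppEmitter::shouldEmitTu from the diff.
// `onlyTu` defaults to "" to mirror the default of -translation-unit-id.
bool shouldEmitTu(const std::string &tuId, const std::string &onlyTu = "") {
  return tuId == onlyTu;
}
```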

marbre · 2025-01-17T10:10:20Z

Quick question before having a chance to take a deeper look. Would it make sense to introduce an emitc.module instead of an emitc.tu? Only referring to the naming here.

mgehre-amd · 2025-01-17T10:13:24Z

Quick question before having a chance to take a deeper look. Would it make sense to introduce an emitc.module instead of an emitc.tu? Only referring to the naming here.

I don't mind the naming. In practice, we use a structure like

module {
  emitc.tu {
   func ...
  }
  emitc.tu {
   func ...
  }
}

and I liked that the emitc.tu was visually different from the enclosing module. But either is fine with me.

mgehre-amd · 2025-01-17T10:13:42Z

This feature is opt-in; by default,
mlir-translate will continue to emit the whole module.

Can you reformat this line, please? Also, maybe I'm reading this wrong, but I expected the code to emit all TUs if not filtered by the translation-unit-id flag. Should we reformulate to something like "mlir-translate will continue to emit operations defined outside of tu operations"?

I have reformulated the description. Does that sound better?

aniragil · 2025-01-17T18:31:26Z

Quick question before having a chance to take a deeper look. Would it make sense to introduce an emitc.module instead of an emitc.tu? Only referring to the naming here.

I don't mind the naming. In practice, we use a structure like
module {
  emitc.tu {
   func ...
  }
  emitc.tu {
   func ...
  }
}
and I liked that the emitc.tu was visually different from the enclosing module. But either is fine with me.

At least in C99, "translation unit" seems to refer to the output of the preprocessor, i.e. the source file after all preprocessing directives have been expanded. As emitc supports emitting preprocessing directives, notably #include statements, using this term might be misleading. The standard's term for the file keeping the user's source code seems to be "source file" / "preprocessing file", so emitc.source_file / emitc.source might be more accurate.

As C++20 defined a C++ module construct, emitc.module may be conflicting / misleading as well.

marbre · 2025-01-17T19:39:47Z

At least in C99, "translation unit" seems to refer to the output of the preprocessor, i.e. the source file after all preprocessing directives have been expanded. As emitc supports emitting preprocessing directives, notably #include statements, using this term might be misleading. The standard's term for the file keeping the user's source code seems to be "source file" / "preprocessing file", so emitc.source_file / emitc.source might be more accurate.

As C++20 defined a C++ module construct, emitc.module may be conflicting / misleading as well.

Thanks, I didn't have C++20 modules in mind at that moment; let's avoid that name, then.

mgehre-amd · 2025-01-20T07:50:10Z

How about emitc.file? This would indicate clearly that the body of this op is intended to be generated into its own file, and it would stay at the syntactic level where most of emitc is.

aniragil · 2025-01-20T09:06:28Z

Thinking about the semantics of the proposed op, such a file-scope emitc op may be useful beyond the functionality introduced by this patch, e.g. for

(a) Specifying the C dialect the emitc code belongs to, e.g. in order for the translator to know how to emit vector types when targeting OpenCL, GCC vector extensions etc.

(b) For offload/launch semantics such as in gpu.launch_func/gpu.module, which could prove useful for OpenCL, OpenMP etc.

(c) Validation: Utilizing standard op validation to verify that all its internal ops are emitc and potentially that these ops and types are valid in its designated C-dialect (e.g. vector types, address spaces, templates).

Would be good to discuss this op's designated uses to make sure we don't introduce backward compatibility issues later.
WDYT?

marbre · 2025-01-21T16:53:56Z

Thinking about the semantics of the proposed op, such a file-scope emitc op may be useful beyond the functionality introduced by this patch, e.g. for

(a) Specifying the C dialect the emitc code belongs to, e.g. in order for the translator to know how to emit vector types when targeting OpenCL, GCC vector extensions etc.

I think this needs to be modeled in the dialect. With this you wouldn't pass a flag to the emitter but of course the emitter would need to know how to handle those ops with dialect information.

(b) For offload/launch semantics such as in gpu.launch_func/gpu.module, which could prove useful for OpenCL, OpenMP etc.

Not sure I see how this is related to a file-scope as proposed by this patch.

(c) Validation: Utilizing standard op validation to verify that all its internal ops are emitc and potentially that these ops and types are valid in its designated C-dialect (e.g. vector types, address spaces, templates).

Adding a validation pass is something I started some time ago but didn't have time to finish. So yes, +1 from me. However, does it influence the changes introduced by this patch?

Would be good to discuss this op's designated uses to make sure we don't introduce backward compatibility issues later. WDYT?

Makes sense, yes.

marbre · 2025-01-22T11:43:40Z

How about emitc.file? This would indicate clearly that the body of this op is intended to be generated into its own file, and it would stay at the syntactic level where most of emitc is.

Thinking more about the naming, I also ended up at emitc.file several times:

emitc.source_file is rather lengthy.
emitc.source works, but here the "source" gets emitted, so file might be the better fit.
emitc.unit might work as well and is closer to tu, but I am not really in favor of this.

Any further opinions or suggestions?

aniragil · 2025-01-23T11:19:59Z

Thinking about the semantics of the proposed op, such a file-scope emitc op may be useful beyond the functionality introduced by this patch, e.g. for
(a) Specifying the C dialect the emitc code belongs to, e.g. in order for the translator to know how to emit vector types when targeting OpenCL, GCC vector extensions etc.

I think this needs to be modeled in the dialect. With this you wouldn't pass a flag to the emitter but of course the emitter would need to know how to handle those ops with dialect information.

Exactly. It could also guide the lowering process, e.g. by scalarizing vectors for C, implementing tuples as std::tuple for C++ or as structs for C etc.

(b) For offload/launch semantics such as in gpu.launch_func/gpu.module, which could prove useful for OpenCL, OpenMP etc.

Not sure I see how this is related to a file-scope as proposed by this patch.

So in order to support gpu.launch_func, gpu.module is a symbol. We could add this later, but we might want to plan ahead. For instance, if we allow any string as the proposed id it might later be unusable as a valid symbol name. We could then add a second attribute for the symbol name or pose restrictions on the id attribute, with both options breaking backward compatibility of existing MLIR files.
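
To make the concern about unrestricted ids concrete, here is a minimal sketch of what a restriction on the id could look like. It assumes, purely for illustration, that the id would be limited to C-style identifiers (letters, digits and underscores, not starting with a digit). That rule and the name `isIdentifierLike` are hypothetical; they are not MLIR's symbol-name grammar and not something this patch implements.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical restriction on the id attribute: accept only C-style
// identifiers. This is an assumed rule for illustration, not MLIR's own.
bool isIdentifierLike(const std::string &id) {
  if (id.empty())
    return false;
  auto isStart = [](unsigned char c) { return std::isalpha(c) || c == '_'; };
  auto isBody = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
  if (!isStart(id[0]))
    return false;
  for (std::size_t i = 1; i < id.size(); ++i) {
    if (!isBody(id[i]))
      return false;
  }
  return true;
}
```

Under this assumed rule, an id such as "tu_one" would be accepted, while "non-existing" (which contains a hyphen) would be rejected, so tightening the rule after release could invalidate existing files.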

Another aspect is the necessity of a dedicated emitc op. If its sole purpose is to provide a named scope, would the builtin module suffice (as IINM in the LLVM dialect)?

(c) Validation: Utilizing standard op validation to verify that all its internal ops are emitc and potentially that these ops and types are valid in its designated C-dialect (e.g. vector types, address spaces, templates).

Adding a validation pass is something I started some time ago but didn't have time to finish. So yes, +1 from me. However, does it influence the changes introduced by this patch?

So I was thinking whether the op should have a ::verify() method. Unless the translator itself runs such a verifying pass (which I'm not sure translators are supposed to do) this pass would be optional, letting users call the translator on invalid files. If we implement ::verify() the op would have to be introduced after all lowering to emitc has been done. If we do this later on, we might be breaking downstream compilers which expected it to be usable earlier.

Would be good to discuss this op's designated uses to make sure we don't introduce backward-compatibility problems later. WDYT?

Makes sense, yes.

Any other potential uses for this op?

mgehre-amd · 2025-01-23T14:08:22Z

So in order to support gpu.launch_func, gpu.module is a symbol. We could add this later, but we might want to plan ahead. For instance, if we allow any string as the proposed id it might later be unusable as a valid symbol name. We could then add a second attribute for the symbol name or pose restrictions on the id attribute, with both options breaking backward compatibility of existing MLIR files.

Adding an additional attribute later would not break compatibility. And we don't know yet if we are actually going to get this, right?

Another aspect is the necessity of a dedicated emitc op. If its sole purpose is to provide a named scope, would the builtin module suffice (as IINM in the LLVM dialect)?

I think that would work. It would limit us if we want to extend it in the future, e.g. by adding language standards to the scope.

(c) Validation: Utilizing standard op validation to verify that all its internal ops are emitc and potentially that these ops and types are valid in its designated C-dialect (e.g. vector types, address spaces, templates).

Adding a validation pass is something I started some time ago but didn't have time to finish. So yes, +1 from me. However, does it influence the changes introduced by this patch?

So I was thinking whether the op should have a ::verify() method. Unless the translator itself runs such a verifying pass (which I'm not sure translators are supposed to do) this pass would be optional, letting users call the translator on invalid files. If we implement ::verify() the op would have to be introduced after all lowering to emitc has been done. If we do this later on, we might be breaking downstream compilers which expected it to be usable earlier.

I'm opposed to recursive verification in ::verify as this is very costly and ::verify gets called a lot. Others have done this heavy validation via separate validation passes (e.g. TOSA).
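To illustrate the distinction being drawn here, the following is a minimal sketch of a one-shot recursive check that would live in a standalone validation step rather than in `::verify`. It is a toy model only: `ToyOp`, `isEmitCOp` and `collectNonEmitCOps` are hypothetical names invented for this sketch and are not part of the MLIR API or of this patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an operation: a dialect-qualified name plus the
// operations nested inside it. This is NOT the MLIR Operation class.
struct ToyOp {
  std::string name;            // e.g. "emitc.file" or "arith.addi"
  std::vector<ToyOp> children; // ops nested in this op's body
};

// Returns true if the op name carries the "emitc." dialect prefix.
bool isEmitCOp(const std::string &name) {
  return name.rfind("emitc.", 0) == 0;
}

// Walks everything nested under `op` exactly once and records the name of
// every nested op that is not an emitc op. The intent is to call this once,
// from a dedicated validation step, instead of on every verifier run.
void collectNonEmitCOps(const ToyOp &op, std::vector<std::string> &errors) {
  for (const ToyOp &child : op.children) {
    if (!isEmitCOp(child.name))
      errors.push_back(child.name);
    collectNonEmitCOps(child, errors);
  }
}
```

Because the walk is recursive, its cost grows with the size of the nested body, which is the reason given above for keeping it out of a verifier that is invoked frequently.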

Would be good to discuss this op's designated uses to make sure we don't introduce backward-compatibility problems later. WDYT?

Makes sense, yes.

Any other potential uses for this op?

I think we should be pragmatic here. A source file is a source file. If we need different, more elaborate constructs, there is always the option to add new ops or change existing ones. MLIR/emitc have not been backwards compatible, so we shouldn't make it too hard for ourselves.

In summary, I see consensus to rename this to emitc.file.

Is there any objection to getting this merged afterwards? I heard many ideas about possible extensions, but I don't see how the current design (an operation with an identifier) would make any of them impossible.

marbre · 2025-01-23T14:19:46Z

Adding an additional attribute later would not break compatibility. And we don't know yet if we are actually going to get this, right?

Yes, that sounds reasonable to me.

So I was thinking whether the op should have a ::verify() method. Unless the translator itself runs such a verifying pass (which I'm not sure translators are supposed to do) this pass would be optional, letting users call the translator on invalid files. If we implement ::verify() the op would have to be introduced after all lowering to emitc has been done. If we do this later on, we might be breaking downstream compilers which expected it to be usable earlier.

I'm opposed to recursive verification in ::verify as this is very costly and ::verify gets called a lot. Others have done this heavy validation via separate validation passes (e.g. TOSA).

Yes, with "Adding a validation pass" in my comment above, I was referring to a validation pass like the one TOSA has to check for compliance with a TOSA spec. A downstream user would still need to run this pass (or add it to a pipeline), but that should be acceptable.

Any other potential uses for this op?

I think we should be pragmatic here. A source file is a source file. If we need different, more elaborate constructs, there is always the option to add new ops or change existing ones. MLIR/emitc have not been backwards compatible, so we shouldn't make it too hard for ourselves.

👍

In summary, I see consensus to rename this to emitc.file.

Is there any objection to getting this merged afterwards? I heard many ideas about possible extensions, but I don't see how the current design (an operation with an identifier) would make any of them impossible.

No objections. I would say, let's go ahead with emitc.file.

aniragil · 2025-01-26T16:22:10Z

Adding an additional attribute later would not break compatibility. And we don't know yet if we are actually going to get this, right?

Yes, that sounds reasonable to me.

So I was thinking whether the op should have a ::verify() method. Unless the translator itself runs such a verifying pass (which I'm not sure translators are supposed to do) this pass would be optional, letting users call the translator on invalid files. If we implement ::verify() the op would have to be introduced after all lowering to emitc has been done. If we do this later on, we might be breaking downstream compilers which expected it to be usable earlier.

I'm opposed to recursive verification in ::verify as this is very costly and ::verify gets called a lot. Others have done this heavy validation via separate validation passes (e.g. TOSA).

Yes, with "Adding a validation pass" in my comment above, I was referring to a validation pass like the one TOSA has to check for compliance with a TOSA spec. A downstream user would still need to run this pass (or add it to a pipeline), but that should be acceptable.

Any other potential uses for this op?

I think we should be pragmatic here. A source file is a source file. If we need different, more elaborate constructs, there is always the option to add new ops or change existing ones. MLIR/emitc have not been backwards compatible, so we shouldn't make it too hard for ourselves.

👍

In summary, I see consensus to rename this to emitc.file.
Is there any objection to getting this merged afterwards? I heard many ideas about possible extensions, but I don't see how the current design (an operation with an identifier) would make any of them impossible.

Yes, IIUC we're all OK with changing its behavior later as needed, so let's move forward.

No objections. I would say, let's go ahead with emitc.file.

Agreed.

mgehre-amd · 2025-01-28T17:30:30Z

I updated the PR to emitc.file
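For reference, the selection behaviour described in the PR summary (ops outside any `emitc.file` are always emitted, and the body of an `emitc.file` is emitted only when its id matches `-file-id`) can be modelled roughly as below. This is a toy sketch under stated assumptions, not the code in TranslateToCpp.cpp: `ToyTopLevelOp` and `selectOpsToEmit` are hypothetical names, and an empty `std::optional` is assumed here to stand for "no `-file-id` flag given".

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a top-level op of a module; NOT the MLIR API.
struct ToyTopLevelOp {
  bool isFile;                   // true if this models an emitc.file op
  std::string name;              // file id if isFile, otherwise the op's name
  std::vector<std::string> body; // names of ops nested in the file
};

// Returns the names of the ops that would be emitted for `fileId`.
// Assumption: std::nullopt models running without the -file-id flag.
std::vector<std::string>
selectOpsToEmit(const std::vector<ToyTopLevelOp> &module,
                const std::optional<std::string> &fileId) {
  std::vector<std::string> emitted;
  for (const ToyTopLevelOp &op : module) {
    if (!op.isFile) {
      // Ops outside of any file are emitted in every mode.
      emitted.push_back(op.name);
      continue;
    }
    // File bodies are skipped unless the requested id matches.
    if (fileId && op.name == *fileId)
      emitted.insert(emitted.end(), op.body.begin(), op.body.end());
  }
  return emitted;
}
```

With the two files from the PR example, requesting `"main"` selects `func_one` and requesting `"test"` selects `func_two`; with no id, both file bodies are skipped.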

marbre

Some quick comments.

marbre

Thanks!

mgehre-amd requested review from marbre, simon-camp and aniragil January 17, 2025 07:57

llvmbot added mlir mlir:emitc labels Jan 17, 2025

simon-camp reviewed Jan 17, 2025

mgehre-amd force-pushed the matthias.emitc_tu branch from e3524e5 to cdf3b0a Compare January 28, 2025 17:28

mgehre-amd changed the title ~~[MLIR] emitc: Add emitc translation unit op~~ [MLIR] emitc: Add emitc.file op Jan 28, 2025

mgehre-amd requested a review from simon-camp January 28, 2025 17:30

mgehre-amd force-pushed the matthias.emitc_tu branch from cdf3b0a to 58b6d86 Compare January 28, 2025 17:30

marbre requested changes Jan 29, 2025

Review comments

c7261f6

mgehre-amd requested a review from marbre January 30, 2025 09:31

simon-camp approved these changes Jan 30, 2025

aniragil requested changes Feb 2, 2025

marbre reviewed Feb 4, 2025

Fix comments

50c3031

mgehre-amd requested a review from aniragil February 4, 2025 11:12

marbre approved these changes Feb 4, 2025

aniragil approved these changes Feb 15, 2025

mgehre-amd merged commit 4cc7d60 into llvm:main Feb 18, 2025
8 checks passed

mgehre-amd deleted the matthias.emitc_tu branch February 18, 2025 14:21
