Skip to content

Conversation

@bethebunny
Copy link
Contributor

Tablegen builders generated for InferTypeOpTrait currently generate a call to llvm::report_fatal_error. This prevents error recovery even though the underlying machinery, including the generic builder, support failing result type inference.

This PR updates reportFatalInferReturnTypesError -> emitInferReturnTypesError, using mlir::emitError() instead of llvm::report_fatal_error, and then updates OpDefinitionsGen to use this function.

@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels Oct 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 28, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-core

Author: Stef Lindall (bethebunny)

Changes

Tablegen builders generated for InferTypeOpTrait currently generate a call to llvm::report_fatal_error. This prevents error recovery even though the underlying machinery, including the generic builder, support failing result type inference.

This PR updates reportFatalInferReturnTypesError -> emitInferReturnTypesError, using mlir::emitError() instead of llvm::report_fatal_error, and then updates OpDefinitionsGen to use this function.


Full diff: https://github.com/llvm/llvm-project/pull/165488.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Interfaces/InferTypeOpInterface.h (+2-3)
  • (modified) mlir/lib/Interfaces/InferTypeOpInterface.cpp (+8-11)
  • (modified) mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp (+1-1)
diff --git a/mlir/include/mlir/Interfaces/InferTypeOpInterface.h b/mlir/include/mlir/Interfaces/InferTypeOpInterface.h
index 4fcbeff9df560..3cae45e17a15b 100644
--- a/mlir/include/mlir/Interfaces/InferTypeOpInterface.h
+++ b/mlir/include/mlir/Interfaces/InferTypeOpInterface.h
@@ -245,9 +245,8 @@ inferReturnTensorTypes(ArrayRef<ShapedTypeComponents> retComponents,
 /// the op. Precondition: op implements InferTypeOpInterface.
 LogicalResult verifyInferredResultTypes(Operation *op);
 
-/// Report a fatal error indicating that the result types could not be
-/// inferred.
-void reportFatalInferReturnTypesError(OperationState &state);
+/// Report an error indicating that the result types could not be inferred.
+void emitInferReturnTypesError(OperationState &state);
 } // namespace detail
 
 namespace OpTrait {
diff --git a/mlir/lib/Interfaces/InferTypeOpInterface.cpp b/mlir/lib/Interfaces/InferTypeOpInterface.cpp
index 9f4f672fb9f4d..add0101f226d7 100644
--- a/mlir/lib/Interfaces/InferTypeOpInterface.cpp
+++ b/mlir/lib/Interfaces/InferTypeOpInterface.cpp
@@ -240,15 +240,12 @@ LogicalResult mlir::detail::verifyInferredResultTypes(Operation *op) {
   return result;
 }
 
-void mlir::detail::reportFatalInferReturnTypesError(OperationState &state) {
-  std::string buffer;
-  llvm::raw_string_ostream os(buffer);
-  os << "Failed to infer result type(s):\n"
-     << "\"" << state.name << "\"(...) "
-     << state.attributes.getDictionary(state.location.getContext()) << " : ("
-     << llvm::interleaved(llvm::map_range(
-            state.operands, [](Value val) { return val.getType(); }))
-     << ") -> ( ??? )";
-  emitRemark(state.location, "location of op");
-  llvm::report_fatal_error(llvm::StringRef(buffer));
+void mlir::detail::emitInferReturnTypesError(OperationState &state) {
+  mlir::emitError(state.location)
+      << "Failed to infer result type(s):\n"
+      << "\"" << state.name << "\"(...) "
+      << state.attributes.getDictionary(state.location.getContext()) << " : ("
+      << llvm::interleaved(llvm::map_range(
+             state.operands, [](Value val) { return val.getType(); }))
+      << ") -> ( ??? )";
 }
diff --git a/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp b/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
index 371864830a3c1..d75261186f45a 100644
--- a/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
+++ b/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
@@ -2681,7 +2681,7 @@ void OpEmitter::genSeparateArgParamBuilder() {
                       {1}.regions, inferredReturnTypes)))
           {1}.addTypes(inferredReturnTypes);
         else
-          ::mlir::detail::reportFatalInferReturnTypesError({1});
+          ::mlir::detail::emitInferReturnTypesError({1});
         )",
                       opClass.getClassName(), builderOpState);
       return;

Copy link
Member

@jpienaar jpienaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MLIR build calls with invalid inputs aren't expected to pass/will crash in many cases. If the type inference fails it's due to invalid config and there should be no build: nothing is checking for failure today and it would silently result in invalid op's state being propagated.

What folks do here sometimes is two step, infer result type and then build (if valid) with explicit type. That could be made into a helper.

{1}.addTypes(inferredReturnTypes);
else
::mlir::detail::reportFatalInferReturnTypesError({1});
::mlir::detail::emitInferReturnTypesError({1});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens post this in error case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up to the usage, similar to calling op.verify

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the caller in this context supposed to know that there was an error?
You emit a diagnostic but don't propagate any error state do you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I'm following the same pattern used in <Type>::getChecked and <Type>::parse for instance, which afaict expect you to check whether a diagnostic was emitted to signal an error state. I don't think this is a great pattern necessarily, we can do something else, but it's the pattern that seems to (1) have prior use in the codebase and (2) appears to locally best fit the needs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no: getChecked returns a null Attribute (or Type) which you can check. This is not the case here, the MLIR builder API has no provision to fail building an operation, and your patch does not do this right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. What is your recommendation here? Generally I would gravitate towards patterns like FailureOr but as you say the builder API doesn't have a clear place to put this.

@bethebunny
Copy link
Contributor Author

bethebunny commented Oct 29, 2025

The context here is using MLIR through FFI, in particular I heavily use MLIR through Python bindings. When MLIR asserts, it crashes the python interpreter with no stack trace, so instead for op creation failures we take the diagnostics and raise them as a Python exception.

For instance the generic Operation::create doesn't assert. It's generally not fallible, the contract as I understand it is to create the op and then call verify which uses emitError to report failures.

In the MLIR C API, since inferring result types may fail, the contract is to return a null Operation rather than assert.

What folks do here sometimes is two step, infer result type and then build (if valid) with explicit type. That could be made into a helper.

I think that's exactly what the generated builders for inferred result types do. My goal here is to be able to use the generated builders rather than expect to have to explicitly write all builders for inferred result types. Additionally we sometimes define custom builders which call generated builders, and we would need to take care to never call generated builders in that case.

@jpienaar
Copy link
Member

The context here is using MLIR through FFI, in particular I heavily use MLIR through Python bindings. When MLIR asserts, it crashes the python interpreter with no stack trace, so instead for op creation failures we take the diagnostics and raise them as a Python exception.

That should already be happening https://github.com/llvm/llvm-project/blame/9d1b6eec60490c09ab8367667fb70637695179c3/mlir/lib/Bindings/Python/IRCore.cpp#L1529 . If you have example where that isn't we should fix it.

I think that's exactly what the generated builders for inferred result types do.

Yes I wrote that (41a73dd) :)

For instance the generic Operation::create doesn't assert.

If you have a NamedAttrList with repeated keys, it will assert fail.

the contract as I understand it is to create the op and then call verify which uses emitError to report failures.

In general yes. The equivalent here would be for type inference to succeed but return an ErrorType that then fails verification of the trait. The problem with type though it is pervasive and the created op's output can be used as input when creating the next (if null, segfault when dereferenced while creating the next) and then you have potentially far propagation of invalid state, which complicates debugging.

In the MLIR C API, since inferring result types may fail, the contract is to return a null Operation rather than assert.

Yes, this is returning a Operation* effectively rather than FooOp view. You'll see no one checks if a C++ Op::create returned null or functions that take it as input checks. So it would result in silent corruption/error at distance vs today as everyone expects it doesn't get to that.

My goal here is to be able to use the generated builders rather than expect to have to explicitly write all builders for inferred result types

You don't need to though, I'm saying you can do something like

template <class T>
FailureOr<T> createFailable(...) { ... }

and then do exactly the same as is done in Python, and this just forwards to the underlying generated builders, so nothing extra to write.

@bethebunny
Copy link
Contributor Author

bethebunny commented Oct 29, 2025

If you have example where that isn't we should fix it.

It's not in precisely the place I'm updating.

nb::class_<FooOp>(m, "FooOp")
  .def_static("create", nb::overload_cast<OpBuilder&, Location>(&FooOp::create))

If you have a NamedAttrList with repeated keys, it will assert fail.

I mean for failures in inferring result types

Yes, this is returning a Operation* effectively rather than FooOp view. You'll see no one checks if a C++ Op::create returned null or functions that take it as input checks. So it would result in silent corruption/error at distance vs today as everyone expects it doesn't get to that.

This doesn't sound like how I'm using MLIR today at all :) but I acknowledge this is a common pattern and I don't want to break folk's existing usages. In all our usages emitError() signals a failure out-of-band, and that seems to be a common pattern upstream also.

You don't need to though, I'm saying you can do something like

template <class T>
FailureOr<T> createFailable(...) { ... }

and then do exactly the same as is done in Python, and this just forwards to the underlying generated builders, so nothing extra to write.

I think this has the usage pattern backwards. I'm binding the C++ op builder calls, whether they're generated default builders, or specified via interface / c++ class definition, or have custom implementations etc. I can do this just fine in the normal case, however any of these builders which transitively calls a generated builder that can fatal error is poison and I suddenly can't bind against. This means I have to rewrite custom implementations of both the generated op builders and any op builders which happen to depend on those.

@bethebunny
Copy link
Contributor Author

If there's concern about breaking existing users who are relying on fatal errors, maybe it makes sense to generate createChecked variants, similar to getChecked builders for attrs and types? I generate bindings for these on the attr/type side rather than the unchecked variants for a similar reason

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mlir:core MLIR Core Infrastructure mlir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants