
Throw better error message when not returning a tensor or tuple of tensors #1491

Closed
dellis23 opened this issue Oct 13, 2022 · 6 comments · Fixed by #1509
@dellis23
Collaborator

See #1471 for background

@dellis23 dellis23 self-assigned this Oct 13, 2022
@dellis23
Collaborator Author

@silvasean

Re: the case we talked about today, of returning a single-item tuple. The following error is thrown:

Traceback (most recent call last):
  File "/usr/local/google/home/danielellis/torch-mlir/repro.py", line 40, in <module>
    main()
  File "/usr/local/google/home/danielellis/torch-mlir/repro.py", line 32, in main
    linalg_on_tensors_mlir = torch_mlir.compile(
  File "/usr/local/google/home/danielellis/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/__init__.py", line 274, in compile
    run_pipeline_with_repro_report(
  File "/usr/local/google/home/danielellis/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/compiler_utils.py", line 73, in run_pipeline_with_repro_report
    raise TorchMlirCompilerError(trimmed_message) from None
torch_mlir.compiler_utils.TorchMlirCompilerError: Lowering TorchScript IR -> Torch Backend IR failed with the following diagnostics:
error: failed to legalize unresolved materialization from '!torch.tuple<tensor>' to '!torch.tensor' that remained live after conversion
note: see current operation: %2 = "builtin.unrealized_conversion_cast"(%1) : (!torch.tuple<tensor>) -> !torch.tensor
note: see existing live user here: "func.return"(%2) : (!torch.tensor) -> ()

We talked about putting another check just before the success here. However, we never actually reach that line; we fail earlier, in the call to applyPartialConversion.

@ramiro050
Collaborator

I think the issue is this type conversion:

typeConverter.addConversion(
    [](Torch::TupleType type,
       SmallVectorImpl<Type> &types) -> Optional<LogicalResult> {
      llvm::append_range(types, type.getContainedTypes());
      return success();
    });

It should not succeed if the tuple only has one element in it.

Another point that might be important to look at is:

if (auto tuple = type.dyn_cast<Torch::TupleType>()) {
llvm::append_range(newResultTypes, tuple.getContainedTypes());
continue;
}

That conversion should also result in an error if the tuple has one element in it.
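As a rough illustration of the check being proposed, here is a plain-Python model of the conversion logic (the real code is C++ inside Torch-MLIR's type converter; the function name and return convention here are hypothetical, with `None` standing in for a conversion failure):

```python
def convert_tuple_type(contained_types):
    """Flatten a tuple type into its element types, but refuse to flatten
    a single-element tuple so the caller can emit a clear diagnostic
    instead of leaving an unresolved materialization behind."""
    if len(contained_types) == 1:
        return None  # signal failure rather than silently flattening
    return list(contained_types)

# A two-element tuple flattens; a one-element tuple is rejected.
print(convert_tuple_type(["tensor", "tensor"]))  # ['tensor', 'tensor']
print(convert_tuple_type(["tensor"]))            # None
```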

@ramiro050
Collaborator

Actually, I think the conversions should simply return the tuple type when the tuple has one element, rather than throwing an error, since there are valid cases where we should have a function returning a single element tuple. For example:

def foo(x):
    return (x,)

def forward(x):
    return foo(x)[0]
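Run as plain Python, the example above is just an identity function, which is why the single-element tuple return is legitimate for a private helper: the tuple never escapes to the caller of `forward`.

```python
def foo(x):
    return (x,)          # private helper returning a single-element tuple

def forward(x):
    return foo(x)[0]     # caller unpacks immediately; net effect is identity

print(forward(42))  # 42
```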

@silvasean
Contributor

silvasean commented Oct 19, 2022

Actually, I think the conversions should simply return the tuple type when the tuple has one element, rather than throwing an error, since there are valid cases where we should have a function returning a single element tuple. For example:

Strictly speaking, we need to distinguish between externally-callable functions and private functions. AdjustCallingConventions is only about externally-callable functions, and for those, a single-element tuple is not a valid return type (it will not satisfy the backend contract). (it makes sense to catch that as soon as possible)

@ramiro050
Collaborator

AdjustCallingConventions is only about externally-callable functions, and for those, a single-element tuple is not a valid return type (it will not satisfy the backend contract).

Right, that's a good point. I was thinking about the example I mentioned above because of this lit test replacing !torch.tuple<tensor, tensor> with !torch.tuple<tensor>:

func.func @call_tuple_return(%arg0: !torch.tensor {torch.type_bound = !torch.vtensor<[?],f32>},
                             %arg1: !torch.tensor {torch.type_bound = !torch.vtensor<[?],f32>}) -> !torch.tuple<tensor, tensor> {
  %0 = call @tuple_return(%arg0, %arg1) : (!torch.tensor, !torch.tensor) -> !torch.tuple<tensor, tensor>
  return %0 : !torch.tuple<tensor, tensor>
}

But yeah, in that test case, because both functions are public, it makes sense that neither should be allowed to return a single-element tuple.

Strictly speaking, we need to distinguish between externally-callable functions and private functions.

Is this something that is already being done? The AdjustCallingConventions pass seems to apply to all func::FuncOps in the module. (Although we do inline things before that pass, which gets rid of private functions that might return a single-element tuple.)

@silvasean
Contributor

Is this something that is already being done? The AdjustCallingConventions seems to apply to all func::FuncOps in the module. (Although we do inline things before that pass, which gets rid of private functions that might return a single element tuple).

I suspect what we are doing now "happens to be sound", but yeah, AdjustCallingConventions probably should be explicitly checking public/private in theory.
