Skip to content

[tir] Add fake support for 'call_tir' op#12973

Closed
cconvey wants to merge 1 commit intoapache:mainfrom
cconvey:call-tir-node
Closed

[tir] Add fake support for 'call_tir' op#12973
cconvey wants to merge 1 commit intoapache:mainfrom
cconvey:call-tir-node

Conversation

@cconvey
Copy link
Contributor

@cconvey cconvey commented Oct 3, 2022

Allow TVMScript to contain T.call_tir(...) calls. This is a small step toward a complete implementation of the 'call_tir' functionality. The op does nothing, and its arguments are not evaluated at runtime.

@cconvey cconvey changed the title [tir] Add fake support for 'call_tir' op. [tir] Add fake support for 'call_tir' op Oct 3, 2022
@Lunderberg
Copy link
Contributor

@tvm-bot rerun

Allow TVMScript to contain `T.call_tir(...)` calls.
This is a small step toward a complete implementation of
the 'call_tir' functionality.  The op does nothing, and
its arguments are not evaluated at runtime.
@Mousius
Copy link
Member

Mousius commented Oct 4, 2022

Hi @cconvey,

What's the reason we can't implement this now as an Allocate and a Call?

@cconvey
Copy link
Contributor Author

cconvey commented Oct 4, 2022

What's the reason we can't implement this now as an Allocate and a Call?

Thanks for the idea @Mousius ! Would you mind elaborating?

It's entirely possible your idea makes sense, but I'm new to this area of the code, so some more detail would be helpful.

@Mousius
Copy link
Member

Mousius commented Oct 5, 2022

What's the reason we can't implement this now as an Allocate and a Call?

Thanks for the idea @Mousius ! Would you mind elaborating?

It's entirely possible your idea makes sense, but I'm new to this area of the code, so some more detail would be helpful.

I believe this is implementing the same call_tir as:
https://github.com/apache/tvm-rfcs/pull/89/files#r949982699

Which would mean that for a short-term solution, instead of a no-op, you could rewrite it to allocate the buffer, here's my pseudo-code:

int size_of_output = FindSizeOfRet(prim_func->ret_type); // Can't remember a good function off the top of my head
Var tmp_buffer("tmp_buffer");

// These need to be injected before the call somehow, unsure how?
add_before = SeqStmt([
   Let(tmp_buffer, decl_buffer(size_of_output)),
   Call(prim_func, ...original args..., tmp_buffer);
]);

return tmp_buffer; // Replace Call with a the buffer?

The reason I suggest this approach is that provided you rewrote the original call_tir site to use the allocated buffer, that'd mean you could implement all your features with the "allocation and destination passing" strategy instead of having to land them without functionally?

@Lunderberg
Copy link
Contributor

Hey @Mousius . There's a couple of intended differences between the TIR-level call_tir functionality being developed here and the Relax-level call_tir functionality in the Relax proposal.

  • Could occur within a PrimFunc, rather than being in the dataflow block of a Relax function.
  • Doesn't perform any allocation of the destination tensor, nor assume that the TIR function uses destination-passing style.

This could be represented at the TIR level using either builtin::tvm_call_packed, builtin::tvm_call_cpacked, or builtin::call_extern, but those would have their own drawbacks.

  • tvm_call_packed uses a name lookup to the TVM registry. The T.call_tir would not have this level of indirection. In addition, the tvm_call_packed is implemented using by resuing a local stack arguments. The T.call_tir must be extractable into a separate context (e.g. for pipelining across threads), and cannot reuse the same stack.
  • tvm_call_cpacked avoids the name lookup of tvm_call_packed, but still has the stack-based arguments.
  • builtin::call_extern comes closest to what we allows calls into functions defined outside the IRModule (e.g. calls to printf). The T.call_tir would be restricted to making calls within the same IRModule. That is, it would not allow the fallback to llvm::Function::ExternalLinkage that is used for call_extern.

Ideally, we'd want something that has the following properties

  • Only allows calls within the same IRModule.
  • Can occur within a PrimFunc as the result of lowering a single operator.
  • Operates in terms of a tir::Buffer, to allow it to occur at an earlier stage of TIR lowering, but avoid breaking passes that occur in-between.
  • Can have compile-time validation of the shape of a buffer region being passed by the caller, and the expected shape in the PrimFunc::buffer_map of the callee.

We're in the process of seeing how much of a TIR-level T.call_tir implementation could be implemented without introducing new TIR concepts, sketching out what changes would need to be proposed by RFC.

@Mousius
Copy link
Member

Mousius commented Oct 5, 2022

Ahh, sorry @Lunderberg, two things named the same and all that 😸 in which case would it be useful to map this to call_extern for now and add validation against the IRModule later? I'm just wondering if we can implement something so that you can see it tested and working 🤔

@cconvey
Copy link
Contributor Author

cconvey commented Oct 5, 2022

@tvm-bot rerun

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lunderberg / @cconvey happy for this to go in as-is 😸

@areusch areusch added needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it and removed needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it labels Oct 19, 2022
@Mousius
Copy link
Member

Mousius commented Feb 27, 2023

@cconvey I think this has been superseded, can we close?

@Mousius
Copy link
Member

Mousius commented Mar 6, 2023

Closing for now, re-open if necessary 😸

@Mousius Mousius closed this Mar 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants