Wiring up the PrimFunc resource_handle #34
A friendly ping!
[drawbacks]: #drawbacks

* This changes the intrinsic `call_packed` to assume the final argument is the `resource_handle`, making it incompatible with previous releases
* The lack of structure in the `call_packed` means it'll be
can you finish the sentence?
In the VM, `alloc_storage` is represented as a 'call' using an opaque StorageHandle (or something like that) type. Can we fold the additional arg into the regular params, using that trick to distinguish device handles from the normal TIR types? I'm wary of both hidden args and the conceptual overhead of more fields in PrimFunc.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

* Introduce another intrinsic and matching `call_cpacked` which have the suffix `_with_resource_handle` or similar - this means each code generator would have to implement additional intrinsics to support `resource_handle`s fully.
I feel like this kind of munges the definition of `CallNode`'s `args` array: the idea is that this array directly forms the `TVMValue` passed to `arg_values` in the `TVMBackendPackedCFunc`. At the very least, is it possible we can wrap `resource_handle` with a sentinel `PrimExpr` or `dtype`? I'd like to avoid a secret convention that `call_packed` includes a special final arg.
cc @mbs-octoml @jroesch @junrushao1994 any suggestions? It's kind of awkward since `call_packed` doesn't have its own CallNode or object which can be extended.
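To make the "sentinel" suggestion above concrete, here is a minimal C sketch of a caller that peels off an explicitly tagged final argument and forwards it as the separate `resource_handle` parameter. Everything prefixed `Demo`/`demo_` is hypothetical: the types are simplified stand-ins rather than TVM's definitions, the type codes are invented for illustration, and the function-pointer shape only mirrors the argument array plus trailing handle described in the comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a packed argument value (assumption, not TVM's type). */
typedef union {
  int64_t v_int64;
  void* v_handle;
} DemoValue;

/* Hypothetical type codes; kDemoResourceHandle plays the role of the sentinel. */
enum { kDemoInt = 0, kDemoHandle = 1, kDemoResourceHandle = 2 };

/* Packed C function shape: argument array plus a separate resource_handle. */
typedef int (*DemoPackedCFunc)(DemoValue* args, int* type_codes, int num_args,
                               DemoValue* out_ret_value, int* out_ret_tcode,
                               void* resource_handle);

/* Hypothetical device context carried by the resource handle. */
typedef struct {
  int device_id;
} DemoDevice;

/* Example callee: adds the device id to its single integer argument. */
int demo_driver(DemoValue* args, int* type_codes, int num_args,
                DemoValue* out_ret_value, int* out_ret_tcode,
                void* resource_handle) {
  (void)type_codes;
  if (num_args != 1 || resource_handle == NULL) return -1;
  DemoDevice* dev = (DemoDevice*)resource_handle;
  out_ret_value->v_int64 = args[0].v_int64 + dev->device_id;
  *out_ret_tcode = kDemoInt;
  return 0;
}

/* Caller side: the final argument is treated as the resource handle only when
 * it carries the sentinel type code, so no unmarked convention is needed. */
int demo_call_packed(DemoPackedCFunc f, DemoValue* args, int* type_codes,
                     int num_args, DemoValue* ret, int* ret_tcode) {
  void* resource_handle = NULL;
  if (num_args > 0 && type_codes[num_args - 1] == kDemoResourceHandle) {
    resource_handle = args[num_args - 1].v_handle;
    num_args -= 1; /* the handle is not part of the regular arguments */
  }
  return f(args, type_codes, num_args, ret, ret_tcode, resource_handle);
}
```

With a tag like this, a call that does not carry a handle simply forwards `NULL`, and the regular arguments keep their one-to-one mapping onto the value array.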
Will comment more later. One thing I want to note is that the resource handle is specific to the PackedCFunc signature and does not generally exist in a normal packed func (one backed by, say, a GPU driver closure). As a result, we might want to think a bit more carefully about the calling convention here. My recommendation would still be to try to hide this variable from the PrimFunc interface and leverage intrinsics (such as "get resource handle") to obtain the handle from the environment. We might be able to introduce a "call packed C func with handle" intrinsic that passes the handle on to the packed C func.
okay I've done a bit more research on this one. Here is one source of tension here:
This really has quite a narrow use case: in the C runtime with the AOT executor and when the C interface API is used. Other use cases disqualify this usage pattern:
Taken together, this means we need the following restrictions on this usage pattern:
Considering this altogether, this is essentially an attempt to simplify the multi-
Since ultimately the driver must be written knowing where the "context" is going to be (either as Ultimately at tension here is
@mbs-octoml @electriclillies and I discussed this at length and we followed up with @Mousius offline as well. While it is tempting to re-use Given this, it seems like the best route is just to add it to the arguments list. @Mousius is going to try this and see if we run into any other issues. According to @Mousius , past attempts to fully encapsulate the
Thanks @areusch for a great summary; indeed this is a situation where explicit could be better. A middle ground might be to allow the packed_c function to have an intrinsic that gets such context from the env.
@tqchen in this case we prefer not to pay the cost of retrieving such a pointer, as tracking global state on a µC can be tricky. When the pointer is on the stack, it is easier to leverage argument-passing optimizations supplied by the architecture such as register-mapped arguments.
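The trade-off in this comment can be sketched in C as two ways of reaching the same context. All names here are hypothetical and exist only for illustration; `demo_get_context` merely stands in for the kind of "get handle from the environment" intrinsic suggested earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical driver context (assumption for illustration). */
typedef struct {
  int base_address;
} DemoContext;

/* Option A: the context lives in global state and is fetched on demand. */
static DemoContext* g_demo_context = NULL;

void demo_set_context(DemoContext* ctx) { g_demo_context = ctx; }

DemoContext* demo_get_context(void) { return g_demo_context; }

int demo_read_via_global(int offset) {
  DemoContext* ctx = demo_get_context(); /* extra lookup, hidden dependency */
  if (ctx == NULL) return -1;
  return ctx->base_address + offset;
}

/* Option B: the context is an explicit parameter, so the compiler can pass it
 * like any other pointer argument and no global needs to be tracked. */
int demo_read_via_argument(int offset, DemoContext* ctx) {
  if (ctx == NULL) return -1;
  return ctx->base_address + offset;
}
```

Option B keeps the dependency visible at every call site; whether the pointer actually ends up in a register depends on the target's calling convention.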
I agree; I mean that the intrinsic would translate to the implicit ctx parameter as part of the fn call.
hey @Mousius what's the status on this PR? is it stale/shall we close it?
Eek! Forgot to close this. We merged a magic variable name version of this in https://github.com/apache/tvm/pull/9501/files#diff-a9ac007ffdd51c0bc3fb5780835e17ca92f76fdb775b09c391375247940548ec which didn't have the same impact on the IR.