[TIR] Use IndexMap to transform NDArray #12949
This looks like a fantastic addition to the API, and very useful functionality. My one question is whether we want to use `DeviceAPI::CopyDataFromTo` to avoid the round trip from device to host and back.
In addition to the `AllocateConst` use case you mentioned, it would also be very useful in preparing input/output/expected buffers for unit tests, rather than needing to specify the transformation logic in both `IndexMap` and `np.transpose` calls.
@Lunderberg I've just looked into this possibility, I wish I could use tvm/include/tvm/runtime/device_api.h Line 230 in 8d60b3c
Other variants of |
Rats. I figured it was worth a shot. Two additional questions coming to mind:
I agree that (1) is possible, but making such small copies on the device side (e.g. on a GPU) sounds very slow, which would defeat the purpose of removing the host-device round trip. On (2), it sounds like it would add too much complexity to the otherwise very simple code. I'm not sure how effective such "contiguous region detection" is in practice, but if @vinx13 and @junrushao also think it's a good idea, I'm happy to explore this approach.
This is definitely cool Masa! Similar functionality existed in Relay's
It sounds a bit more complicated than the PR is supposed to be. In light that we have
Good points. I'd been thinking in terms of a small number of discontiguous breaks between largely contiguous regions, but that probably would be rather rare. Agreed that it isn't worth the extra complexity.
@Lunderberg @vinx13 @junrushao Can we merge this? I have another PR ready to be sent that depends on this.
Sure. LGTM!
I've hit a weird use case where I want to manually transform `runtime::NDArray` (attached to `AllocateConst` node) according to the index map used in `transform_layout`. This is needed to support `AllocateConst` node in Metaschedule `RewriteLayout` postproc. I can define it as a free function in the file where it is actually used. Having it available as part of the `IndexMap` interface makes it convenient to expose this to python and unit-test it. Let me know if this is a reasonable API addition.
cc @vinx13 @Lunderberg @junrushao
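For readers following along, here is a minimal pure-Python sketch of the kind of host-side, element-by-element layout transform discussed in this thread. It is not the code in this PR and it does not use TVM: `transform_layout_on_host`, `flat_offset`, and the example index maps are hypothetical names chosen for illustration, and the sketch assumes a bijective index map with no padding, applied to a flat row-major buffer.

```python
from itertools import product


def flat_offset(index, shape):
    """Row-major offset of a multi-dimensional index within the given shape."""
    offset = 0
    for i, extent in zip(index, shape):
        offset = offset * extent + i
    return offset


def transform_layout_on_host(src, src_shape, dst_shape, index_map):
    """Copy every element of `src` to the position given by `index_map`.

    Hypothetical helper: `src` is a flat row-major list, and `index_map`
    takes a source index and returns the destination index as a tuple.
    Assumes the map is a bijection (no padding), so sizes match.
    """
    dst = [None] * len(src)
    for src_index in product(*(range(extent) for extent in src_shape)):
        dst_index = index_map(*src_index)
        dst[flat_offset(dst_index, dst_shape)] = src[flat_offset(src_index, src_shape)]
    return dst


# A 2x3 buffer transposed to 3x2: (i, j) -> (j, i).
transposed = transform_layout_on_host(
    list(range(6)), (2, 3), (3, 2), lambda i, j: (j, i)
)

# A 2x4 buffer tiled along its second axis: (i, j) -> (j // 2, i, j % 2).
tiled = transform_layout_on_host(
    list(range(8)), (2, 4), (2, 2, 2), lambda i, j: (j // 2, i, j % 2)
)
```

Because each element is moved individually, the same loop run as one device copy per element would presumably be slow, which is the concern raised about (1) above.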