Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug][Unity] Legalization of cumsum op is broken #15851

Open
masahi opened this issue Oct 1, 2023 · 2 comments
Open

[Bug][Unity] Legalization of cumsum op is broken #15851

masahi opened this issue Oct 1, 2023 · 2 comments
Labels
branch: unity needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it type: bug

Comments

@masahi
Copy link
Member

masahi commented Oct 1, 2023

R.cumsum is legalized to topi.cumsum according to

https://github.com/apache/tvm/blob/unity/python/tvm/relax/transform/legalize_ops/statistical.py#L90-L92

But this is broken on GPU. topi.cumsum is one of topi ops that cannot be defined by te.compute and thus they are written using IR builder by hand, for CPU and GPU targets separately. Since topi.cumsum is meant to be used only for CPU targets, after legalization we get TIR that has a for loop with parallel kind, for example, even on GPU. Other common ops that would have the same issue include sorting etc.

So in this case, it would be more correct to use topi.cuda.cumsum for legalizing relax.cumsum, meaning the LegalizeOp pass needs to be target dependent. But I think that would break the original semantics of LegalizeOp. DefaultGPUSchedule pass also needs to ignore such legalized ops carefully since they are already scheduled for GPU by hand.

I think we need to discuss what the Relax compilation story is going to be for such ops that were previously implemented using te.extern.

cc @Hzfengsy @MasterJH5574 @quic-sanirudh @tqchen

@masahi masahi added type: bug needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it branch: unity labels Oct 1, 2023
@tqchen
Copy link
Member

tqchen commented Oct 2, 2023

In this particular case, we should build up a tensor ir of cumsum, and then bring dlight schedule for them

@tqchen
Copy link
Member

tqchen commented Nov 7, 2023

opens up a discussion thread in https://discuss.tvm.apache.org/t/discuss-default-compilation-flow-for-scan-and-sort/15949

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: unity needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it type: bug
Projects
None yet
Development

No branches or pull requests

2 participants