
[TOPI] Fix mali conv2d performance regression #3131

Merged
merged 3 commits into apache:master on May 5, 2019

Conversation

merrymercy
Member

@merrymercy changed the title from "[TOPI] fix mali conv" to "[TOPI] Fix mali conv2d performance regression" on May 2, 2019
```diff
-    + tvm.const(0, out_dtype) * M[alpha-1][alpha-1][CO-1][P_round-1],
+    # thw following hack term is used to make the padding in batch gemm ("M")
+    # effective, otherwise the padding will be eliminated by bound inference
+    + tvm.expr.Mul(tvm.const(0, out_dtype),
```
Member

I suggest leaving a comment pointing to issue #3088 so people understand why Mul is used instead of *.

Member

I'm still confused about why we need this multiplication.

Contributor

thw -> the

Member Author

@merrymercy merrymercy May 4, 2019


@icemelon9 During batch gemm, we introduce some padding to avoid partial tiles, so that we can safely vectorize the innermost loop. However, we don't use all of the output of batch gemm (the padded part is ignored in the final results). The InferBound pass in TVM analyzes the computation region from output to input and keeps only the necessary part. If we don't add this term, the padding added in batch gemm will be eliminated, regardless of how we tweak the shape argument in tvm.compute.
This term accesses the last element in the padded buffer, which makes all of the padding effective.
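The mechanism described above can be sketched without TVM. Below is a toy, plain-Python stand-in for a bound-inference pass (the names `infer_extent`, `accesses`, and the sizes 30/32 are hypothetical, not from the PR): the inferred extent covers only the indices actually read, so padding survives only if something reads into it.

```python
# Toy sketch (NOT TVM's InferBound): a bound-inference pass keeps just
# enough of a buffer to cover the largest index any consumer reads.

def infer_extent(accesses, padded_extent):
    """Extent a bound-inference pass would keep: the smallest region
    covering every read, capped at the allocated (padded) size."""
    return min(padded_extent, max(accesses) + 1)

P = 30        # hypothetical real number of tiles
P_round = 32  # hypothetical size padded up for vectorization

# Consumers read only the first P elements, so inference trims the padding:
assert infer_extent(range(P), P_round) == P

# A single (zero-weighted) access to the last padded element makes the
# whole padded buffer "necessary", so the padding is preserved:
assert infer_extent(list(range(P)) + [P_round - 1], P_round) == P_round
```

This mirrors why the hack term reads `M[alpha-1][alpha-1][CO-1][P_round-1]`: touching the last element of the padded buffer forces the full padded region to be kept.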

Member Author

@merrymercy merrymercy May 4, 2019


@yzhliu tvm.expr.Mul won't do constant folding, while * is equivalent to tvm.expr.Mul plus constant folding.
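The distinction can be illustrated with a minimal stand-in folder (plain Python, not TVM's real simplifier; `fold_mul`, `raw_mul`, and the tuple encoding are hypothetical): folding `0 * x` to a constant drops the buffer access entirely, which is exactly what the hack term must avoid.

```python
# Toy sketch of why `*` defeats the hack: `*` triggers constant folding,
# which rewrites 0 * M[...] to the constant 0 and discards the load,
# whereas constructing the Mul node directly keeps the load in the IR.

def fold_mul(lhs, rhs):
    # What `*` effectively does: simplify 0 * anything to constant 0.
    if lhs == ("const", 0) or rhs == ("const", 0):
        return ("const", 0)
    return ("mul", lhs, rhs)

def raw_mul(lhs, rhs):
    # What tvm.expr.Mul does: build the node, no simplification.
    return ("mul", lhs, rhs)

load = ("load", "M", "alpha-1, alpha-1, CO-1, P_round-1")

assert fold_mul(("const", 0), load) == ("const", 0)  # buffer access gone
assert raw_mul(("const", 0), load)[2] is load        # buffer access kept
```

With the access kept, bound inference still sees a read of the last padded element, so the padding is not eliminated.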

Member

Could you elaborate in the code comment with what you replied to @icemelon9?

Member Author

It's too long to put in a code comment.

@merrymercy merrymercy assigned merrymercy and yzhliu and unassigned merrymercy May 4, 2019
@yzhliu yzhliu merged commit 88daa2b into apache:master May 5, 2019
@yzhliu
Member

yzhliu commented May 5, 2019

Thanks @merrymercy @tqchen @eqy @icemelon9 for fixing and reviewing.

@merrymercy merrymercy deleted the fix-mali branch May 5, 2019 10:42
wweic pushed a commit to wweic/tvm that referenced this pull request May 13, 2019
* [TOPI] fix mali conv

* fix typo

* address comments
wweic pushed a commit to neo-ai/tvm that referenced this pull request May 13, 2019
* [TOPI] fix mali conv

* fix typo

* address comments
4 participants