depthwise packed perf: reduce texel read for dilation of 2 #4954
This looks great. Curious, how much overhead is saved by reusing the same vertex shader?
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @pyu10055)
tfjs-backend-webgl/src/conv_packed_gpu_depthwise.ts, line 78 at r1 (raw file):
`; for (let texelC = 0; texelC < (texelsAcross + 1) / 2; texelC++) {
Just want to confirm that this change alters the number of loop iterations. For example, if texelsAcross = 4, the bound was texelC < 3 before and is texelC < 2 now.
Unfortunately, there seems to be very minimal saving on the first inference.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @lina128)
tfjs-backend-webgl/src/conv_packed_gpu_depthwise.ts, line 78 at r1 (raw file):
Previously, lina128 (Na Li) wrote…
Just want to confirm that this change alters the number of loop iterations. For example, if texelsAcross = 4, the bound was texelC < 3 before and is texelC < 2 now.
Yes. The old bound generated extra loop iterations that did nothing, since each iteration generates two values.
Use a flag to record whether a texel has already been read and is ready to be reused. This minimizes unnecessary reads in all cases.
Aligned the texel naming with the index.
Reuse the same vertex shader for all GPGPU programs.
Verified the result of this change with the ssd colab.
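The flag-based reuse described above can be illustrated with a small CPU-side model. This is only a sketch, not the shader code generated by conv_packed_gpu_depthwise.ts: `CachedTexelReader`, `neededColumns`, and the sample row are hypothetical names and data, and the model assumes one packed texel holds two adjacent columns of a row.

```typescript
// Hypothetical model: one packed texel holds two adjacent columns of a row.
type Texel = [number, number];

class CachedTexelReader {
  reads = 0; // number of simulated texel fetches
  private row: number[];
  private cache: Texel[] = [];
  private ready: boolean[] = []; // flag: texel already read and reusable

  constructor(row: number[]) {
    this.row = row;
  }

  // Returns the value at `col`, fetching its texel only on first use.
  valueAt(col: number): number {
    const texelIndex = Math.floor(col / 2);
    if (!this.ready[texelIndex]) {
      this.cache[texelIndex] = this.fetch(texelIndex);
      this.ready[texelIndex] = true;
    }
    return this.cache[texelIndex][col % 2];
  }

  // Simulated texel read; out-of-range columns are treated as zero padding.
  private fetch(texelIndex: number): Texel {
    this.reads++;
    const base = texelIndex * 2;
    return [this.row[base] ?? 0, this.row[base + 1] ?? 0];
  }
}

// Input columns touched by one packed output pair (columns 0 and 1)
// for a given filter width and dilation.
function neededColumns(filterWidth: number, dilation: number): number[] {
  const cols: number[] = [];
  for (let k = 0; k < filterWidth; k++) {
    cols.push(k * dilation, k * dilation + 1);
  }
  return cols;
}

const row = [10, 11, 12, 13, 14, 15];
const reader = new CachedTexelReader(row);
const cols = neededColumns(3, 2);
const values = cols.map((c) => reader.valueAt(c));
console.log(`lookups: ${values.length}, texel reads: ${reader.reads}`);
```

In this model, a filter of width 3 with dilation 2 performs six value lookups but fetches each of the three texels only once, because the `ready` flag short-circuits repeated reads of the same texel.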