[WebGL] Support packed Conv2DBackpropInput #7339

Linchenn · 2023-02-04T00:19:31Z

With this PR, the Conv2DBackpropInput op is accelerated by ~2x, which partially fixes #5197.

| Model | Before (ms) | After (ms) |
| --- | --- | --- |
| BlazePoseDetector | 23.1 | 17.5 |
| ArPortraitDepth | 52.3 | 23.9 (with EXP_CONV also set to true) |
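For readers unfamiliar with the op, the following is a minimal CPU reference sketch of what Conv2DBackpropInput computes: the gradient of a 2-D convolution with respect to its input. It is not the packed WebGL shader from this PR. The function name `conv2dBackpropInputRef`, the single-image (no batch) layout, and the symmetric `pad` parameter are assumptions made for illustration.

```typescript
// Reference sketch (assumed helper, not part of tfjs).
// dy: [outH][outW][outC], w: [fH][fW][inC][outC], result dx: [inH][inW][inC].
function conv2dBackpropInputRef(
    dy: number[][][], w: number[][][][],
    inH: number, inW: number, stride: number, pad: number): number[][][] {
  const fH = w.length, fW = w[0].length;
  const inC = w[0][0].length, outC = w[0][0][0].length;
  const outH = dy.length, outW = dy[0].length;
  const dx: number[][][] = [];
  for (let r = 0; r < inH; r++) {
    const row: number[][] = [];
    for (let c = 0; c < inW; c++) {
      const px: number[] = [];
      for (let d1 = 0; d1 < inC; d1++) px.push(0);
      for (let wR = 0; wR < fH; wR++) {
        // Forward conv reads input row yR * stride - pad + wR, so invert it.
        const yR = (r + pad - wR) / stride;
        if (yR < 0 || yR >= outH || Math.floor(yR) !== yR) continue;
        for (let wC = 0; wC < fW; wC++) {
          const yC = (c + pad - wC) / stride;
          if (yC < 0 || yC >= outW || Math.floor(yC) !== yC) continue;
          for (let d1 = 0; d1 < inC; d1++) {
            for (let d2 = 0; d2 < outC; d2++) {
              px[d1] += dy[yR][yC][d2] * w[wR][wC][d1][d2];
            }
          }
        }
      }
      row.push(px);
    }
    dx.push(row);
  }
  return dx;
}

// Usage: a 2x2 all-ones filter with one input and one output channel.
const onesFilter = [[[[1]], [[1]]], [[[1]], [[1]]]];
const dxSmall = conv2dBackpropInputRef([[[3]]], onesFilter, 2, 2, 1, 0);
const dxLarge = conv2dBackpropInputRef(
    [[[1], [1]], [[1], [1]]], onesFilter, 3, 3, 1, 0);
```

Each input pixel accumulates every `dy` value it contributed to in the forward pass, which is why positions that do not divide evenly by the stride are skipped.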

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.

qjia7

Clever algorithm and great work!

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts

pyu10055

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @Linchenn)

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 55 at r1 (raw file):

        //intialize dotProd with a small epsilon seems to reduce GPU accuracy loss.
        vec4 dotProd = vec4(0.000000000000001);

Use a JS constant and interpolate it into the shader for this value.
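One way to read this suggestion is sketched below: keep the epsilon as a named TypeScript constant and interpolate it into the shader source string. The names `DOT_PROD_EPSILON` and `dotProdInit` are hypothetical and do not come from the PR.

```typescript
// Hypothetical constant name; the value matches the literal quoted above.
const DOT_PROD_EPSILON = 1e-15;

// Builds the GLSL initializer line from a JS number.
// toExponential() keeps the literal short and in a float-literal form.
function dotProdInit(epsilon: number): string {
  return `vec4 dotProd = vec4(${epsilon.toExponential()});`;
}

const initLine = dotProdInit(DOT_PROD_EPSILON);
```

Naming the constant documents why the value exists and keeps it in one place if it ever needs tuning.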

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 90 at r1 (raw file):

              dotProd.xy += vec2(dot(dyValue, wValue.xy), dot(dyValue, wValue.zw)) * idyCVal;

              dySample = getDy(batch, idyR, idyC2, d2);

It might be good to group all memory accesses together.

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 90 at r1 (raw file):

              dotProd.xy += vec2(dot(dyValue, wValue.xy), dot(dyValue, wValue.zw)) * idyCVal;

              dySample = getDy(batch, idyR, idyC2, d2);

Could idyC2 be the same as idyC? If so, only read again when idyC2 != idyC.

Linchenn

Thank you, Ping. After tuning the shader, the inference time for the following op improved from ~2.8928 ms to ~2.6810 ms on a Linux workstation:

const x = tf.ones([1, 8, 6, 256]);
const w = tf.ones([4, 4, 256, 256]);

// Profile one conv2dTranspose call and read the time of its first kernel.
const s = (await tf.profile(() => tf.conv2dTranspose(x, w, [1, 16, 12, 256], 1, 'valid'))).kernels[0].kernelTimeMs;

Benchmarking shows that the improvement comes mainly from adding a condition for reading dySample2, which on its own takes the time from ~2.8928 ms to ~2.7113 ms.
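The idea behind that condition can be sketched outside the shader. The snippet below is a plain TypeScript illustration, not the GLSL from this PR: it counts reads and reuses the first sample whenever the second index equals the first. The names `makeCountingReader` and `sumPairs` are hypothetical.

```typescript
// Wraps an array and counts how often it is read, standing in for texture fetches.
function makeCountingReader(data: number[]) {
  let reads = 0;
  return {
    get(i: number): number {
      reads++;
      return data[i];
    },
    readCount(): number {
      return reads;
    },
  };
}

// For each index pair, fetch the second value only when it differs from the first.
function sumPairs(
    reader: {get(i: number): number}, pairs: Array<[number, number]>): number {
  let acc = 0;
  for (const [i, i2] of pairs) {
    const first = reader.get(i);
    const second = i2 === i ? first : reader.get(i2);
    acc += first + second;
  }
  return acc;
}

const reader = makeCountingReader([1, 2, 3]);
const total = sumPairs(reader, [[0, 0], [1, 2], [2, 2]]);
const readsUsed = reader.readCount();
```

Here six potential reads shrink to four. On a GPU the branch has a cost of its own, so whether the condition pays off is something to measure, as was done above.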

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @mattsoulanille and @pyu10055)

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 55 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

Use a JS constant and interpolate it into the shader for this value.

Removed it, as we discussed: this initial value is currently unnecessary. We can revisit it if correctness issues come up later.

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 90 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

Could idyC2 be the same as idyC? If so, only read again when idyC2 != idyC.

Done.

tfjs-backend-webgl/src/conv_backprop_packed_gpu.ts line 90 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

It might be good to group all memory accesses together.

Done.

pyu10055

Reviewed 1 of 2 files at r1, 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @mattsoulanille)

Linchenn added 4 commits February 3, 2023 16:09

pack

29d34e5

sss

a8717d3

format

36a9896

Merge branch 'conv2dbackprop' of https://github.com/Linchenn/tfjs int…

9822670

…o conv2dbackprop

Linchenn requested review from pyu10055 and qjia7 February 4, 2023 00:55

Merge branch 'master' into conv2dbackprop

062a3f5

qjia7 approved these changes Feb 6, 2023

pyu10055 requested changes Feb 6, 2023

Linchenn added 3 commits February 6, 2023 10:24

Update conv_backprop_packed_gpu.ts

4d24ae4

tune shader

db38149

remove ini value for dotProd

6500652

Linchenn commented Feb 6, 2023

Linchenn and others added 2 commits February 6, 2023 14:05

remove dotProd

c4f5928

Merge branch 'master' into conv2dbackprop

69fc1d7

pyu10055 approved these changes Feb 6, 2023

Linchenn merged commit 021942f into tensorflow:master Feb 6, 2023

Linchenn deleted the conv2dbackprop branch February 6, 2023 22:52

gaikwadrahul8 mentioned this pull request Jun 5, 2023

[Perf] The time of Conv2DBackpropInput is very long in BlazePose/hand_detector models in WebGL #5197

Closed
