This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Finish gradient for tf.matMul #957

Merged 2 commits on Apr 17, 2018
Conversation

@jgartman (Contributor) commented Apr 15, 2018

This PR finishes the implementation of the gradient for tf.matMul. tensorflow/tfjs#108



@dsmilkov (Contributor) commented

Thank you! One suggestion!


Reviewed 2 of 2 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


src/ops/matmul.ts, line 65 at r1 (raw file):

          b: () => a.toFloat().matMul(dy, true, false)
        };
      } else if (!transposeA && transposeB) {

looks like you can collapse the 4 cases into one by flipping transposeA and transposeB in the right spots (please double-check me on this):

a: () => dy.matMul(b.toFloat(), transposeA, !transposeB),
b: () => a.toFloat().matMul(dy, !transposeA, transposeB)

Comments from Reviewable

@jgartman (Contributor, Author) commented

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/matmul.ts, line 65 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

looks like you can collapse the 4 cases into one by flipping transposeA and transposeB in the right spots (please double-check me on this):

a: () => dy.matMul(b.toFloat(), transposeA, !transposeB),
b: () => a.toFloat().matMul(dy, !transposeA, transposeB)

Thanks, I might be misunderstanding, but I'm not sure that would work. For example, in the !transposeA && transposeB case, suppose a has shape [3, 2] and b has shape [1, 2]. The product a·bᵀ has shape [3, 1], so the gradient dy also has shape [3, 1]. If we compute db as aᵀ·dyᵀ, we would be multiplying matrices with shapes [2, 3] and [1, 3], whose inner dimensions don't match. Something similar happens in the transposeA && !transposeB case. Let me know if I'm missing something.
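The shape argument above can be sanity-checked with plain arrays, no tfjs needed. This is a minimal sketch (my own helpers, not the library implementation) for the !transposeA && transposeB case, where y = a·bᵀ:

```typescript
type Mat = number[][];

function shape(m: Mat): [number, number] { return [m.length, m[0].length]; }

function transpose(m: Mat): Mat {
  return m[0].map((_, j) => m.map(row => row[j]));
}

// Returns null when the inner dimensions do not match.
function matMul(x: Mat, y: Mat): Mat | null {
  if (x[0].length !== y.length) return null;
  return x.map(row =>
      y[0].map((_, j) => row.reduce((acc, v, k) => acc + v * y[k][j], 0)));
}

function zeros(r: number, c: number): Mat {
  return Array.from({length: r}, () => new Array(c).fill(0));
}

const a = zeros(3, 2);               // shape [3, 2]
const b = zeros(1, 2);               // shape [1, 2]
const y = matMul(a, transpose(b))!;  // y = a · bᵀ, shape [3, 1]
const dy = zeros(3, 1);              // gradient has y's shape: [3, 1]

// Correct gradients for y = a · bᵀ:
const da = matMul(dy, b);             // [3, 1] · [1, 2] -> [3, 2], matches a
const db = matMul(transpose(dy), a);  // [1, 3] · [3, 2] -> [1, 2], matches b

// The single-case proposal would compute db as aᵀ · dyᵀ:
const bad = matMul(transpose(a), transpose(dy));  // [2, 3] · [1, 3] -> null
```

Here `da` and `db` come out with the same shapes as `a` and `b`, while the proposed `aᵀ·dyᵀ` hits the inner-dimension mismatch described above.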



@dsmilkov (Contributor) commented

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/matmul.ts, line 65 at r1 (raw file):

Previously, jgartman (Josh Gartman) wrote…

Thanks, I might be misunderstanding, but I'm not sure that would work. For example, in the !transposeA && transposeB case, suppose a has shape [3, 2] and b has shape [1, 2]. The product a·bᵀ has shape [3, 1], so the gradient dy also has shape [3, 1]. If we compute db as aᵀ·dyᵀ, we would be multiplying matrices with shapes [2, 3] and [1, 3], whose inner dimensions don't match. Something similar happens in the transposeA && !transposeB case. Let me know if I'm missing something.

Ah you are completely right! I double checked. Thanks for being super careful!
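For reference, the four cases the PR keeps correspond to the standard matmul gradients. Writing y = matMul(a, b, transposeA, transposeB) and checking shapes in each case, they are (my own summary for the reader, not the literal diff):

```latex
\begin{aligned}
y &= AB^{\phantom{\top}}:           & dA &= dY\,B^{\top}, & dB &= A^{\top} dY \\
y &= AB^{\top}:                     & dA &= dY\,B,        & dB &= dY^{\top} A \\
y &= A^{\top} B^{\phantom{\top}}:   & dA &= B\,dY^{\top}, & dB &= A\,dY \\
y &= A^{\top} B^{\top}:             & dA &= B^{\top} dY^{\top}, & dB &= dY^{\top} A^{\top}
\end{aligned}
```

In every row, dA has A's shape and dB has B's shape, which is exactly the invariant the single flipped-flags formula fails to preserve in the two mixed-transpose cases.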



@dsmilkov (Contributor) commented

:lgtm_strong: Really nice work. Thank you Josh!


Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.



@tensorflow tensorflow deleted a comment from googlebot Apr 17, 2018
@dsmilkov dsmilkov merged commit 1162d18 into tensorflow:master Apr 17, 2018
@jgartman jgartman deleted the matMulGrad branch August 25, 2018 17:23