This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Finish gradient for tf.matMul #957

Merged 2 commits on Apr 17, 2018
Conversation

@jgartman (Contributor) commented Apr 15, 2018

This PR finishes the implementation of the gradient for tf.matMul. tensorflow/tfjs#108



@dsmilkov (Contributor) commented

Thank you! One suggestion!


Reviewed 2 of 2 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


src/ops/matmul.ts, line 65 at r1 (raw file):

          b: () => a.toFloat().matMul(dy, true, false)
        };
      } else if (!transposeA && transposeB) {

looks like you can collapse the 4 cases into one by flipping transposeA and transposeB in the right spots (please double-check me on this):

a: () => dy.matMul(b.toFloat(), transposeA, !transposeB),
b: () => a.toFloat().matMul(dy, !transposeA, transposeB)

Comments from Reviewable

@jgartman (Contributor, Author) commented

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/matmul.ts, line 65 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

looks like you can collapse the 4 cases into one by flipping transposeA and transposeB in the right spots (please double-check me on this):

a: () => dy.matMul(b.toFloat(), transposeA, !transposeB),
b: () => a.toFloat().matMul(dy, !transposeA, transposeB)

Thanks, I might be misunderstanding, but I'm not sure that would work. For example, in the !transposeA && transposeB case, suppose a has shape [3, 2] and b has shape [1, 2]. The product a·bᵀ has shape [3, 1], so the gradient dy also has shape [3, 1]. If we compute db as aᵀ·dyᵀ, we would be multiplying matrices with shapes [2, 3] and [1, 3], whose inner dimensions don't match. Something similar happens in the transposeA && !transposeB case. Let me know if I'm missing something.
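The shape argument above can be sanity-checked with plain arrays, no tfjs needed. This is a minimal sketch (my own helpers, not the library implementation) for the !transposeA && transposeB case, where y = a·bᵀ:

```typescript
type Mat = number[][];

function shape(m: Mat): [number, number] { return [m.length, m[0].length]; }

function transpose(m: Mat): Mat {
  return m[0].map((_, j) => m.map(row => row[j]));
}

// Returns null when the inner dimensions do not match.
function matMul(x: Mat, y: Mat): Mat | null {
  if (x[0].length !== y.length) return null;
  return x.map(row =>
      y[0].map((_, j) => row.reduce((acc, v, k) => acc + v * y[k][j], 0)));
}

function zeros(r: number, c: number): Mat {
  return Array.from({length: r}, () => new Array(c).fill(0));
}

const a = zeros(3, 2);               // shape [3, 2]
const b = zeros(1, 2);               // shape [1, 2]
const y = matMul(a, transpose(b))!;  // y = a · bᵀ, shape [3, 1]
const dy = zeros(3, 1);              // gradient has y's shape: [3, 1]

// Correct gradients for y = a · bᵀ:
const da = matMul(dy, b);             // [3, 1] · [1, 2] -> [3, 2], matches a
const db = matMul(transpose(dy), a);  // [1, 3] · [3, 2] -> [1, 2], matches b

// The single-case proposal would compute db as aᵀ · dyᵀ:
const bad = matMul(transpose(a), transpose(dy));  // [2, 3] · [1, 3] -> null
```

Here `da` and `db` come out with the same shapes as `a` and `b`, while the proposed `aᵀ·dyᵀ` hits the inner-dimension mismatch described above.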



@dsmilkov (Contributor) commented

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


src/ops/matmul.ts, line 65 at r1 (raw file):

Previously, jgartman (Josh Gartman) wrote…

Thanks, I might be misunderstanding, but I'm not sure that would work. For example, in the !transposeA && transposeB case, suppose a has shape [3, 2] and b has shape [1, 2]. The product a·bᵀ has shape [3, 1], so the gradient dy also has shape [3, 1]. If we compute db as aᵀ·dyᵀ, we would be multiplying matrices with shapes [2, 3] and [1, 3], whose inner dimensions don't match. Something similar happens in the transposeA && !transposeB case. Let me know if I'm missing something.

Ah you are completely right! I double checked. Thanks for being super careful!
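For reference, the four cases the PR keeps correspond to the standard matmul gradients. Writing y = matMul(a, b, transposeA, transposeB) and checking shapes in each case, they are (my own summary for the reader, not the literal diff):

```latex
\begin{aligned}
y &= AB^{\phantom{\top}}:           & dA &= dY\,B^{\top}, & dB &= A^{\top} dY \\
y &= AB^{\top}:                     & dA &= dY\,B,        & dB &= dY^{\top} A \\
y &= A^{\top} B^{\phantom{\top}}:   & dA &= B\,dY^{\top}, & dB &= A\,dY \\
y &= A^{\top} B^{\top}:             & dA &= B^{\top} dY^{\top}, & dB &= dY^{\top} A^{\top}
\end{aligned}
```

In every row, dA has A's shape and dB has B's shape, which is exactly the invariant the single flipped-flags formula fails to preserve in the two mixed-transpose cases.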



@dsmilkov (Contributor) commented

:lgtm_strong: Really nice work. Thank you Josh!


Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.



@tensorflow tensorflow deleted a comment from googlebot Apr 17, 2018
@dsmilkov dsmilkov merged commit 1162d18 into tensorflow:master Apr 17, 2018
@jgartman jgartman deleted the matMulGrad branch August 25, 2018 17:23