Skip to content

Conversation

@lina128
Copy link
Collaborator

@lina128 lina128 commented May 7, 2020

We have to do the transformation outside because the kernel api expects Tensor3D. The c++ api doesn't have this restriction. We can consider changing the api in a separate PR.


This change is Reviewable

@lina128 lina128 requested review from annxingyuan and tafsiri May 7, 2020 03:27
Copy link
Contributor

@tafsiri tafsiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, great work. I would wait on @annxingyuan review and her thoughts on my question on this one (unless you already know the answer).

Reviewed 17 of 17 files at r1.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, and @tafsiri)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

export type BatchMatMulInputs = Pick<NamedTensorInfoMap, 'a'|'b'>;
export interface BatchMatMulAttrs {
  transposeA: boolean;

I noticed that the attr in C++ api is adj_x and adj_y (https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul). Which says it will adjoint (transpose and conjugate), the slices of x and y respectively. Here we call it transpose. Do we know if we are doing the adjoint but just referring to it as transpose or doing just the transpose bit. If the latter than we should leave this as is.

If we are actually implementation wise doing the adjoint then we can change these attribute names (and input names) to match the C++ api.

@annxingyuan it looks like you wrote a lot of batchMatMul implementation so you might know the answer to this.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

I noticed that the attr in C++ api is adj_x and adj_y (https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul). Which says it will adjoint (transpose and conjugate), the slices of x and y respectively. Here we call it transpose. Do we know if we are doing the adjoint but just referring to it as transpose or doing just the transpose bit. If the latter than we should leave this as is.

If we are actually implementation wise doing the adjoint then we can change these attribute names (and input names) to match the C++ api.

@annxingyuan it looks like you wrote a lot of batchMatMul implementation so you might know the answer to this.

I also have this question, it seems that the argument names align with MatMul. But from converter, it seems that we are mapping MatMul, BatchMatMul, and BatchMatMulV2 to this BatchMatMul. V2 does broadcasting. Curious which implementation we actually have. @annxingyuan

Copy link
Contributor

@annxingyuan annxingyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan and @pyu10055)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

I also have this question, it seems that the argument names align with MatMul. But from converter, it seems that we are mapping MatMul, BatchMatMul, and BatchMatMulV2 to this BatchMatMul. V2 does broadcasting. Curious which implementation we actually have. @annxingyuan

Regarding Yannick's question - we are only transposing, not returning the complex conjugate. Conjugation only matters for complex inputs, but the WebGL backend throws an error if the inputs are complex. So I think the name should stay 'transpose'.

Regarding Na's question - I don't think we support broadcasted matMul, but maybe I'm misinterpreting how BatchMatMulV2 implements broadcasting - @pyu10055 might know better.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, annxingyuan (Ann Yuan) wrote…

Regarding Yannick's question - we are only transposing, not returning the complex conjugate. Conjugation only matters for complex inputs, but the WebGL backend throws an error if the inputs are complex. So I think the name should stay 'transpose'.

Regarding Na's question - I don't think we support broadcasted matMul, but maybe I'm misinterpreting how BatchMatMulV2 implements broadcasting - @pyu10055 might know better.

Added definition links:
MatMul: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/mat-mul
BatchMatMul: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul
BatchMatMul2: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul-v2

Converter mapping: https://github.com/tensorflow/tfjs/blob/master/tfjs-converter/src/operations/executors/matrices_executor.ts#L31-L38

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

Added definition links:
MatMul: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/mat-mul
BatchMatMul: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul
BatchMatMul2: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/batch-mat-mul-v2

Converter mapping: https://github.com/tensorflow/tfjs/blob/master/tfjs-converter/src/operations/executors/matrices_executor.ts#L31-L38

Hi Ann,
Thank you for the answer! The documentation for V2 specifies: BatchMatMulV2 supports broadcasting in the batch dimensions. Otherwise, the description is the same as BatchMatMul. I think if we don't support broadcasting, the converter should throw an 'unsupported op' error? Otherwise the input may receive different shape that we don't handle, is that right? @annxingyuan @pyu10055

@lina128 lina128 requested a review from pyu10055 May 8, 2020 17:32
Copy link
Contributor

@annxingyuan annxingyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

Hi Ann,
Thank you for the answer! The documentation for V2 specifies: BatchMatMulV2 supports broadcasting in the batch dimensions. Otherwise, the description is the same as BatchMatMul. I think if we don't support broadcasting, the converter should throw an 'unsupported op' error? Otherwise the input may receive different shape that we don't handle, is that right? @annxingyuan @pyu10055

Hi Na, thanks for the additional detail. We do throw an error within matMul if the batch dimensions don't match - so we are at least handling it at some level.

I don't think we want to drop support for BatchMatMulV2 (if that's what you're suggesting) since many instances will not require broadcasting. Maybe @pyu10055 has thoughts on whether we should add more error handling up the stack.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-core/src/kernel_names.ts, line 33 at r1 (raw file):

Previously, annxingyuan (Ann Yuan) wrote…

Hi Na, thanks for the additional detail. We do throw an error within matMul if the batch dimensions don't match - so we are at least handling it at some level.

I don't think we want to drop support for BatchMatMulV2 (if that's what you're suggesting) since many instances will not require broadcasting. Maybe @pyu10055 has thoughts on whether we should add more error handling up the stack.

I just feel that we should add broadcasting to implement BatchMatMulV2 properly. I'm ok with it as is too. Thank you Ann.

@lina128 lina128 merged commit b98f510 into tensorflow:master May 11, 2020
@lina128 lina128 deleted the temp3 branch May 11, 2020 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants