Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix conjugate-transpose for matrices of certain sizes (issue #19200) #46736

Merged
merged 4 commits into from Mar 4, 2021

Conversation

OmriSteiner
Copy link
Contributor

When using tf.linalg.matrix_transpose or tf.linalg.adjoint to conjugate-transpose a complex tensor, the functions might fail to conjugate the tensor under certain circumstances.

Given a complex matrix with m rows and n columns, if the following conditions are met:

  1. The operation runs on GPU.
  2. (m > 96 and n < 16) or (n > 96 and n < 16).
    Then the matrix will not be conjugated (although it will be transposed).

Code snippet which reproduces the bug (before patch):

import tensorflow as tf

with tf.device("GPU"):
    A = tf.complex(0., tf.ones(shape=(97,2)))
    Ah = tf.linalg.matrix_transpose(A, conjugate=True)

A[0][0] == Ah[0][0] ### True - but should be False

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Jan 27, 2021
@google-cla
Copy link

google-cla bot commented Jan 27, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Jan 27, 2021
@OmriSteiner
Copy link
Contributor Author

@googlebot I signed it!

@google-cla
Copy link

google-cla bot commented Jan 27, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@gbaned gbaned self-assigned this Jan 28, 2021
@gbaned gbaned added the comp:core issues related to core part of tensorflow label Jan 28, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jan 28, 2021
@gbaned
Copy link
Contributor

gbaned commented Jan 28, 2021

@OmriSteiner Can you please sign CLA. Thanks!

@OmriSteiner
Copy link
Contributor Author

@gbaned My employer signed the CLA and added me as an authorized contributor, but apparently it takes several days for approval?

@OmriSteiner
Copy link
Contributor Author

@googlebot I signed it!

@google-cla google-cla bot added cla: yes and removed cla: no labels Jan 28, 2021
@gbaned gbaned added the awaiting review Pull request awaiting review label Feb 5, 2021
@gbaned gbaned requested a review from hyeygit February 16, 2021 16:26
Copy link
Member

@allenlavoie allenlavoie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -890,7 +899,7 @@ struct TransposeElemType<4> {
};
template <>
struct TransposeElemType<8> {
using type = uint64;
using type = float2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary for correctness? Would we run into similar issues with complex<float16> and TransposeElemType<4>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is necessary. Instead of simply conjugating, we call maybe_conj, which is specialized for complex types in GPU (float2 and double2).

I thought about TransposeElemType<4> when fixing this, and came to the conclusion that tensorflow doesn't have a complex type which uses float16. std::complex<float16> doesn't exist, as far as I can tell.

To be honest, I'm not entirely sure why TransposeElemType is there in the first place. If I had to make a guess, I would say that it's in order to make this code work for non-basic types (i.e not tensorflow builtin numeric types). However, this is only used in a very specific place, so it would fail to achieve that.

Thanks for the comment, I found out I didn't fix the issue for complex128, but it's sorted now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, we just have DT_COMPLEX64 and DT_COMPLEX128. Sounds OK then. If you want to get rid of TransposeElemType that's fine too (we might find a failing test).

I do think it's another good reason to have a unit test. If someone comes along and adds a smaller complex type it'd make it at least possible to find and fix this issue.

@OmriSteiner
Copy link
Contributor Author

Yeah, I could write a unit-test for this.

@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Feb 25, 2021
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Feb 26, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 26, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 26, 2021
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Feb 28, 2021
@OmriSteiner
Copy link
Contributor Author

@allenlavoie I've expanded the test to also test complex128.

@gbaned gbaned requested a review from allenlavoie March 1, 2021 03:42
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 1, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 1, 2021
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Mar 4, 2021
@copybara-service copybara-service bot merged commit 3702573 into tensorflow:master Mar 4, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Mar 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes comp:core issues related to core part of tensorflow ready to pull PR ready for merge process size:M CL Change Size: Medium
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

5 participants