
Add torch.eig complex forward (CPU, CUDA) #49168

Closed · wants to merge 14 commits

Conversation

@antocuni (Contributor)

Related to issue #42666

@dr-ci bot commented Dec 10, 2020

💊 CI failures summary and remediations

As of commit e1c1ebe (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_coverage_test1 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Dec 24 20:00:45 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Dec 24 20:00:45 At:
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Dec 24 20:00:45 
Dec 24 20:00:45 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Dec 24 20:00:45 
Dec 24 20:00:45 At:
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Dec 24 20:00:45 
Dec 24 20:00:45 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Dec 24 20:00:45 
Dec 24 20:00:45 At:
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Dec 24 20:00:45   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Dec 24 20:00:45 
Dec 24 20:00:45 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker3: EOF: end of file (this is expected to happen during shutdown)
Dec 24 20:00:45 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown)
Dec 24 20:00:45 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker1: EOF: end of file (this is expected to happen during shutdown)
Dec 24 20:00:46 ok (2.346s)
Dec 24 20:00:47   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... [W tensorpipe_agent.cpp:547] RPC agent for worker2 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown)

1 job timed out:

  • pytorch_linux_bionic_py3_8_gcc9_coverage_test1

🚧 5 fixed upstream failures:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.


This comment was automatically generated by Dr. CI.

@antocuni antocuni changed the title WIP Add torch.eig complex forward (CPU, CUDA) Add torch.eig complex forward (CPU, CUDA) Dec 24, 2020
@antocuni antocuni marked this pull request as ready for review December 24, 2020 21:31
@antocuni (Contributor Author)

It seems that all the failing tests also fail upstream, so I think this PR is ready for review.

@mruberry mruberry self-requested a review December 29, 2020 17:36
@mruberry mruberry added the module: linear algebra and triaged labels Dec 29, 2020
@mruberry (Collaborator)

Test failures are unrelated. + @anjali411 to review, too.

This looks correct to me.

@nikitaved (Collaborator)

Is there any progress on this? I'm thinking about rewriting eig_backward with complex input support (alas, the real case with complex eigenvalues is still desired).

@mruberry (Collaborator)

Thanks for the ping, @nikitaved, looks like this got lost over the holidays. I'll take a note to schedule it.


// the API is slightly different for the complex vs real case: if the input
// is complex, eigenvals will be a vector of complex. If the input is real,
// eigenvals will be a (n, 2) matrix containing the real and imaginary parts
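
(For context, a minimal sketch of the two conventions described in the comment above; the complex branch is the behavior this PR adds, so treat the exact shapes and dtypes as assumptions:)

import torch

a = torch.tensor([[1., 2.], [-2., 1.]])

# real input: eigenvalues come back as an (n, 2) matrix of [real, imag] columns
w_real, _ = torch.eig(a)                    # w_real.shape == (2, 2), real dtype

# complex input (this PR): eigenvalues come back as a length-n complex vector
w_cplx, _ = torch.eig(a.to(torch.cdouble))  # w_cplx.shape == (2,), complex dtype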
Contributor

As discussed in #43081, I think we should always return a complex tensor. This will be a BC-breaking change, so we should only update this behavior for torch.linalg.eig.

@antocuni (Contributor Author)

I thought that the point of torch.linalg.* was to be as compatible with NumPy as possible. This would be an unnecessary breakage for people porting their code from NumPy to PyTorch.
What about adding a flag such as returns_complex=False (or =True, if we want the correct-but-NumPy-incompatible behavior by default) to let the user choose?
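
(A hypothetical sketch of the proposed opt-in; returns_complex is not a real parameter and was never adopted, so both calls are shown commented out:)

# hypothetical flag letting callers pick between the (n, 2) real layout and
# a complex eigenvalue vector; never adopted (see the NumPy example below):
# w, v = torch.linalg.eig(x, returns_complex=False)  # (n, 2) [real, imag] layout
# w, v = torch.linalg.eig(x, returns_complex=True)   # complex eigenvalue vector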

Collaborator

Hey @antocuni, it looks like NumPy is already doing it:

In [18]: x = numpy.array([[1, 2], [-2, 1]], numpy.double)

In [19]: x
Out[19]: 
array([[ 1.,  2.],
       [-2.,  1.]])

In [20]: numpy.linalg.eig(x)
Out[20]: 
(array([1.+2.j, 1.-2.j]),
 array([[0.        -0.70710678j, 0.        +0.70710678j],
        [0.70710678+0.j        , 0.70710678-0.j        ]]))

@antocuni (Contributor Author) commented Jan 18, 2021

Ouch, my fault, I overlooked it. OK, I'll implement the "proper" behavior then; thanks for pointing it out.

@antocuni (Contributor Author)

@anjali411 @nikitaved I just realized that torch.linalg.eig doesn't exist yet!
So I think the current behavior of my branch is correct. I agree that torch.linalg.eig should always return complex, but I don't think there is anything we can do about it in this branch.

@antocuni (Contributor Author)

@anjali411 @mruberry:

  • I addressed the issue about using self.is_complex() (thank you for the suggestion, it's much better now!).
  • I think that the issue about the future behaviour of torch.linalg.eig is not relevant to this PR.
  • I merged upstream/master into this branch to make sure the branch still works. It seems it didn't introduce any new failures.

I think the branch is ready for a final round of review and, hopefully, merging.

@mruberry (Collaborator) left a comment

Cool! Would you just add a test that this throws a runtime error when trying to do complex backward through it, @antocuni? Then just ping me and we'll get this merged.

Adding these functions makes a lot of sense since I imagine we'll reuse them for torch.linalg.eig, but maybe we should implement torch.linalg.eig next and not bother implementing complex backward for torch.eig?
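
(A hedged sketch of the kind of test being requested; where exactly the RuntimeError surfaces and its message are assumptions, not the PR's actual test:)

import torch

def test_complex_eig_backward_raises():
    a = torch.randn(3, 3, dtype=torch.complex128, requires_grad=True)
    w, v = torch.eig(a, eigenvectors=True)
    try:
        # reduce to a real scalar so backward() needs no explicit gradient
        (w.abs().sum() + v.abs().sum()).backward()
    except RuntimeError:
        pass  # expected: complex backward for torch.eig is not implemented
    else:
        raise AssertionError("expected a RuntimeError for complex eig backward")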

@codecov bot commented Jan 19, 2021

Codecov Report

Merging #49168 (01c168d) into master (c458558) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #49168      +/-   ##
==========================================
- Coverage   80.64%   80.64%   -0.01%     
==========================================
  Files        1913     1913              
  Lines      208061   208077      +16     
==========================================
+ Hits       167797   167807      +10     
- Misses      40264    40270       +6     

@antocuni (Contributor Author)

> Cool! Would you just add a test that this throws a runtime error when trying to do complex backward through it, @antocuni? Then just ping me and we'll get this merged.

@mruberry done in commit 01c168d.

> Adding these functions makes a lot of sense since I imagine we'll reuse them for torch.linalg.eig, but maybe we should implement torch.linalg.eig next and not bother implementing complex backward for torch.eig?

It's very likely that we will be able to implement torch.linalg.eig in terms of torch.eig (or vice versa) using dispatch: Math, as we did for svd and qr. In that case, once we implement complex backward for one, we get the other "for free".
So the order in which we implement torch.eig's backward and torch.linalg.eig should be irrelevant.

I think @nikitaved planned to work on complex backward for torch.eig, so maybe he has an opinion on this.
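
(A minimal sketch, not the PR's code, of how torch.linalg.eig could be composed over torch.eig so that one backward serves both; re-pairing the eigenvectors for real inputs with complex eigenvalues is deliberately omitted:)

import torch

def linalg_eig_sketch(a):
    w, v = torch.eig(a, eigenvectors=True)
    if not a.is_complex():
        # torch.eig returns real eigenvalues as an (n, 2) [real, imag] matrix;
        # view it as a length-n complex vector, NumPy-style
        w = torch.view_as_complex(w.contiguous())
    return w, v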

@mruberry (Collaborator)

> It's very likely that we will be able to implement torch.linalg.eig in terms of torch.eig (or vice versa) using dispatch: Math, as we did for svd and qr. In that case, once we implement complex backward for one, we get the other "for free".
> So the order in which we implement torch.eig's backward and torch.linalg.eig should be irrelevant.

That's probably true. Whatever @nikitaved thinks best.

@facebook-github-bot (Contributor) left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@nikitaved (Collaborator)

I agree, the order does not matter: the backward for real inputs with complex eigenvalues is not implemented anyway, so we will just need to insert a check on the dtype or shape (if the tensor representing the eigenvalues is real but holds real and imaginary parts).
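
(A hedged Python-level sketch of that check; the function name and messages are illustrative, not the PR's actual code:)

import torch

def check_eig_backward_supported(eigenvalues):
    if eigenvalues.is_complex():
        raise RuntimeError("eig_backward: not implemented for complex eigenvalues")
    # real (n, 2) layout: the second column holds the imaginary parts
    if eigenvalues.dim() == 2 and eigenvalues.size(-1) == 2:
        if bool((eigenvalues[:, 1] != 0).any()):
            raise RuntimeError("eig_backward: not implemented for eigenvalues "
                               "with nonzero imaginary parts")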

@mruberry (Collaborator)

Unfortunately it looks like this is hitting some internal errors, like this:

clahqr.c:function clahqr_: error: undefined reference to 'c_sqrt'
clahqr.c:function clahqr_: error: undefined reference to 'c_sqrt'
claqr0.c:function claqr0_: error: undefined reference to 'c_sqrt'

cc @IvanYashchuk and @ngimel

I have to run but can take a closer look later, too.

@IvanYashchuk (Collaborator)

From the filename, my guess is that CLAPACK is used somewhere and it is missing the -lm flag when compiling.

@mruberry (Collaborator)

> From the filename, my guess is that CLAPACK is used somewhere and it is missing the -lm flag when compiling.

Good guess:

clapack/clapack/SRC/clahqr.c:448: error: undefined reference to 'c_sqrt'

@mruberry (Collaborator)

Update: @malfet has identified the issue. Some builds were excluding c_sqrt because of issues compiling it on Windows. He's testing a fix now.

@facebook-github-bot (Contributor)

@mruberry merged this pull request in 880f007.

@mruberry (Collaborator)

The attribution above is erroneous: @malfet fixed and merged this PR by fixing the Windows builds that use CLAPACK so they could build c_sqrt properly, then fixing the build rules for those builds so they included it again.

Labels: cla signed · Merged · module: linear algebra · open source · triaged

7 participants