Roll-up: remaining TH functions #49421
Comments
I would take this one. Regarding the approach: first port `gels` to ATen (similarly to how it's done for `cholesky_inverse`), then expose it as `torch.linalg.lstsq`.
Calling torch.linalg functions from the torch namespace seems fine. We should look at deprecating lstsq once torch.linalg.lstsq is implemented and has feature parity, and then we should consider it as part of our "doc deprecations" for linalg ahead of the 1.8 release.
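For reference, a minimal sketch of what `gels` (and hence the proposed `torch.linalg.lstsq`) computes; this uses NumPy so it runs without torch, and the data is made up for illustration:

```python
import numpy as np

# Least-squares problem: find x minimizing ||A @ x - b||_2.
# This is the operation LAPACK `gels` performs, which torch.lstsq /
# torch.linalg.lstsq expose.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([6.0, 9.0, 12.0])

x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
# The line y = 3 + 3*t fits these points exactly, so x == [3, 3].
```

The torch.linalg version is expected to return a named tuple (solution, residuals, rank, singular values) in the same spirit as NumPy's API.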
Summary: Now we can remove `_th_orgqr`! Compared to the original TH-based `orgqr`, complex (#33152) and batched inputs are now supported. CUDA support will be added in a follow-up PR. Closes #24747. Ref #49421, #42666

Pull Request resolved: #50502
Reviewed By: mrshenli
Differential Revision: D25953300
Pulled By: mruberry
fbshipit-source-id: f52a74e1c8f51b5e24f7b461430ca8fc96e4d149
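For context on what `orgqr` does: it materializes the orthogonal matrix Q from the Householder reflectors produced by `geqrf`, i.e. Q = H_1 H_2 ... H_k with H_i = I - tau_i v_i v_i^T. A minimal NumPy sketch of a single reflector (illustrative values, not the PyTorch implementation):

```python
import numpy as np

# One Householder reflector that maps x onto a multiple of e_1.
# orgqr builds Q as a product of such reflectors.
x = np.array([3.0, 4.0])
alpha = -np.sign(x[0]) * np.linalg.norm(x)   # target: H @ x == [alpha, 0]
v = x - alpha * np.array([1.0, 0.0])
tau = 2.0 / (v @ v)
H = np.eye(2) - tau * np.outer(v, v)
# H is orthogonal and annihilates the sub-diagonal entry of x.
```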
Summary: Now we can remove `_th_potri`! Compared to the original TH-based `cholesky_inverse`, complex (#33152) and batched inputs (#7500) are now supported on both CPU and CUDA. Closes #24685. Closes #24543. Ref #49421, #42666

Pull Request resolved: #50269
Reviewed By: bdhirsh
Differential Revision: D26047548
Pulled By: anjali411
fbshipit-source-id: e4f191e39c684f241b7cb0f4b4c025de082cccef
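For context, `cholesky_inverse` (LAPACK `potri`) computes A^{-1} from the Cholesky factor L of a positive-definite A. A NumPy sketch of the identity it relies on, A^{-1} = (L^{-1})^T L^{-1}, with an illustrative matrix:

```python
import numpy as np

# Invert a positive-definite matrix via its Cholesky factor,
# which is what torch.cholesky_inverse / LAPACK potri do.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])
L = np.linalg.cholesky(A)      # lower-triangular, A == L @ L.T
Linv = np.linalg.inv(L)
A_inv = Linv.T @ Linv          # equals np.linalg.inv(A)
```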
cc @heitorschueroff for _th_var, _th_std. There are implementations currently in ATen, but, like all non-vectorized reductions, they are much slower, so attempting to directly swap implementations caused severe perf regressions.
Summary: Ref #49421

This migrates `std`/`var`'s special-case all-reduction from TH to ATen. Using the benchmark from gh-43858 that was used to justify keeping the TH version, I find this PR has similar (slightly better) single-threaded performance. And unlike the TH version, this is multi-threaded and so much faster for large tensors.

TH results:
```
[----------------------------- Index ------------------------------]
            |  torch_var  |  torch_var0  |  stdfn   |  torch_sum0
1 threads: ---------------------------------------------------------
        8   |      3.6    |       3.8    |     8.2  |       1.2
       80   |      3.7    |       3.8    |     8.4  |       1.2
      800   |      4.2    |       4.3    |     8.7  |       1.2
     8000   |      9.0    |       9.1    |    11.2  |       1.5
    80000   |     58.3    |      59.0    |    30.6  |       4.2
   800000   |    546.9    |     546.9    |   183.4  |      31.3
  8000000   |   5729.7    |    5701.0    |  6165.4  |     484.1
```

ATen results:
```
[----------------------------- Index ------------------------------]
            |  torch_var  |  torch_var0  |  stdfn   |  torch_sum0
1 threads: ---------------------------------------------------------
        8   |      4.0    |       4.0    |     8.7  |       1.2
       80   |      3.6    |       3.8    |     9.0  |       1.2
      800   |      4.1    |       4.3    |     8.9  |       1.2
     8000   |      8.9    |       9.2    |    10.6  |       1.5
    80000   |     57.0    |      57.4    |    28.8  |       4.3
   800000   |    526.9    |     526.9    |   178.3  |      30.2
  8000000   |   5568.1    |    5560.6    |  6042.1  |     453.2

[----------------------------- Index ------------------------------]
            |  torch_var  |  torch_var0  |  stdfn   |  torch_sum0
8 threads: ---------------------------------------------------------
        8   |      3.9    |       3.8    |     9.1  |       1.2
       80   |      3.8    |       3.9    |     8.8  |       1.2
      800   |      4.2    |       4.3    |     8.9  |       1.3
     8000   |      9.0    |       9.2    |    10.4  |       1.5
    80000   |     26.0    |      26.8    |    26.4  |       4.4
   800000   |     92.9    |      87.3    |    72.1  |      22.4
  8000000   |    793.5    |     791.8    |  5334.8  |     115.1
```

Pull Request resolved: #59258
Reviewed By: mruberry
Differential Revision: D28821216
Pulled By: ngimel
fbshipit-source-id: f35992c21f08a0a8878053680dc0ca7a8facd155
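For context on what the benchmarked all-reduction computes: `torch.var` with no `dim` argument reduces over every element and, by default, applies Bessel's correction (`unbiased=True`). A reference two-pass definition in NumPy (illustrative data; the ATen kernel itself uses a fused single-pass reduction):

```python
import numpy as np

# Reference definition of the all-reduce variance with Bessel's
# correction, matching torch.var's default behavior.
x = np.arange(8.0)                      # stands in for an arbitrary tensor
n = x.size
mean = x.sum() / n
var_unbiased = ((x - mean) ** 2).sum() / (n - 1)
# equals np.var(x, ddof=1)
```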
LegacyFunctionsCPU is now gone, as of #58780.
Not to be confused with THC/THNN/THCUNN, where the list is bigger.
Based on native_functions.yaml
Most of them are linear algebra and should be ported. `_multinomial_alias_setup` is likely not used; we should get rid of it. We can do it!

cc @jianyuh @nikitaved @pearu @mruberry @heitorschueroff @walterddr