[feature request] torch.hypot #22764

Closed
vadimkantorov opened this issue Jul 11, 2019 · 9 comments
Labels
enhancement Not as big of a feature, but technically not a bug. Should be easy to fix module: numpy Related to numpy support, and also numpy compatibility of our operators triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@vadimkantorov (Contributor) commented Jul 11, 2019

torch.stft returns the real and imaginary components in the last dimension. Two functions very frequently applied to its output are abs and angle (in NumPy). Abs is essentially plain np.hypot, so in the meantime, while complex tensor support is not yet developed, hypot could do abs's job.

NumPy also has a hypot, although it accepts two arrays. We could have two versions: one accepting two arrays, and another accepting a single array and a dim. Essentially, hypot is a version of norm specialized for two-element vectors.
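A minimal sketch of the two proposed call forms, assuming a PyTorch recent enough to have torch.hypot (which did not exist when this issue was filed); the single-array + dim form is just a norm over the last dimension:

import torch

X = torch.rand(128, 128, 2)  # real/imag interleaved in the last dimension, as torch.stft returns

mag_two_arg = torch.hypot(X[..., 0], X[..., 1])  # two-array form, mirroring np.hypot
mag_dim = X.norm(dim=-1)                         # single-array + dim form, i.e. a 2-element norm

assert torch.allclose(mag_two_arg, mag_dim)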

@jerryzh168 added the enhancement, module: numpy, and triaged labels Jul 13, 2019
@vadimkantorov (Contributor, Author) commented:

This could be quite fast when the real/imaginary parts are contiguous in memory (the implementation may or may not use the precision-enhancing tricks found in existing hypot implementations; probably not a big deal).
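For reference, a sketch of the classic overflow/precision trick used by many existing hypot implementations (not PyTorch's actual kernel; torch.maximum/torch.minimum from recent PyTorch are assumed): factor out the larger magnitude so the squared ratio stays at most 1.

import torch

def naive_hypot(x, y):
    # Overflows when x**2 or y**2 exceeds the floating-point range.
    return (x * x + y * y).sqrt()

def scaled_hypot(x, y):
    # Factor out the larger magnitude; the remaining ratio is <= 1, so squaring it is safe.
    big = torch.maximum(x.abs(), y.abs())
    small = torch.minimum(x.abs(), y.abs())
    ratio = torch.where(big > 0, small / big, torch.zeros_like(big))
    return big * torch.sqrt(1 + ratio * ratio)

x = torch.tensor([3e200], dtype=torch.float64)
y = torch.tensor([4e200], dtype=torch.float64)
print(naive_hypot(x, y))   # tensor([inf]) -- overflow
print(scaled_hypot(x, y))  # tensor([5.0000e+200])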

@vadimkantorov changed the title from "[feature request] torch.hypot / torch.abs for a complex type" to "[feature request] torch.hypot" Oct 7, 2019
@vadimkantorov (Contributor, Author) commented Feb 14, 2020

Probably the same kernel could be used for complex abs and hypot
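A quick check of that equivalence, assuming a PyTorch recent enough to have both complex tensors and torch.hypot:

import torch

z = torch.randn(1000, dtype=torch.complex64)
# abs(a + bi) and hypot(a, b) compute the same quantity, so a shared kernel is plausible.
assert torch.allclose(z.abs(), torch.hypot(z.real, z.imag))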

@vadimkantorov (Contributor, Author) commented Aug 12, 2020

@muthuArivoli It would be interesting to know perf for:

X = torch.rand(128, 128, 2)
A = X.norm(dim = -1)
B = torch.hypot(X[..., 0], X[..., 1])
C = torch.view_as_complex(X).abs()
D = torch.add(*(X ** 2).unbind(dim = -1)).sqrt()

@muthuArivoli (Contributor) commented:

@vadimkantorov Using the script

import torch
import time

X = torch.rand(2**7, 2**7, 2)
start = time.time()
A = X.norm(dim = -1)
end = time.time()
print(end-start)

start = time.time()
B = torch.hypot(X[..., 0], X[..., 1])
end = time.time()
print(end-start)

start = time.time()
C = torch.view_as_complex(X).abs()
end = time.time()
print(end-start)

start = time.time()
D = torch.add(*(X ** 2).unbind(dim = -1)).sqrt()
end = time.time()
print(end-start)

I got consistent results around

0.0018534660339355469
0.0002155303955078125
0.0001697540283203125
0.00030803680419921875

When I increased the size of the tensor to something large, X = torch.rand(2**14, 2**14, 2), I got results consistently around:

25.421497344970703
0.5420243740081787
0.6201455593109131
0.7890973091125488

These tests were done on a Core i7-6770HQ. I don't have a CUDA-enabled computer with me, so I won't be able to benchmark on GPU.
Hope this helps! Let me know if I should benchmark this in a different way or if you need anything else.

@vadimkantorov (Contributor, Author) commented Aug 13, 2020

@mruberry What is the proper way of measuring CPU perf? I'm having trouble computing wall-clock time in #42959

If torch.hypot is indeed better than other methods (and definitely better than norm), maybe they all could delegate to hypot?

I also wonder if hypot makes use of chunked/consecutive data loading when the first and second arguments are actually interleaved in memory, as in this example. @colesbury
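One way to get more stable CPU numbers is torch.utils.benchmark (available in recent PyTorch), which handles warm-up and thread count. A sketch of timing the same variants, not necessarily the measurement procedure referred to above:

import torch
from torch.utils import benchmark

X = torch.rand(2**7, 2**7, 2)

variants = {
    "norm": "X.norm(dim=-1)",
    "hypot": "torch.hypot(X[..., 0], X[..., 1])",
    "complex abs": "torch.view_as_complex(X).abs()",
    "add + sqrt": "torch.add(*(X ** 2).unbind(dim=-1)).sqrt()",
}
for label, stmt in variants.items():
    # num_threads=1 pins the measurement to a single CPU thread
    t = benchmark.Timer(stmt=stmt, globals={"X": X, "torch": torch}, num_threads=1)
    print(label, t.blocked_autorange())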

@vadimkantorov (Contributor, Author) commented:

@muthuArivoli The large number for norm may actually have the same cause. Could you please try torch.set_num_threads(1)? If the time decreases, it's related to summing times across multiple threads.
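Concretely, the suggested check amounts to pinning CPU execution to one thread before re-running the timings above; a minimal sketch:

import torch
import time

torch.set_num_threads(1)  # disable intra-op parallelism before timing

X = torch.rand(2**7, 2**7, 2)
start = time.time()
A = X.norm(dim=-1)
print(time.time() - start)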

@mruberry (Collaborator) commented:

@ngimel has a reference for how we benchmark, I think

@muthuArivoli (Contributor) commented:

@vadimkantorov With torch.set_num_threads(1) and input X = torch.rand(2**7, 2**7, 2), I got

0.0036995410919189453
0.00023674964904785156
0.00020194053649902344
0.00036525726318359375

With the larger input X = torch.rand(2**14, 2**14, 2), I got

55.82693123817444
2.1235761642456055
1.628692388534546
2.3243396282196045

@vadimkantorov (Contributor, Author) commented:

@ngimel shouldn't norm then delegate to some other method for very small dimensions?
