
Improve torch.fft n-dimensional transforms #46911

Closed
wants to merge 19 commits

Conversation

@peterbell10 (Collaborator) commented Oct 27, 2020

Stack from ghstack:

Differential Revision: D25420647

@dr-ci (bot) commented Oct 27, 2020

💊 CI failures summary and remediations

As of commit e42964b (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@peterbell10 (Collaborator Author)

This replaces the use of _fft_with_size in torch.fft with three new operators: _fft_c2c, _fft_r2c, and _fft_c2r. These are separated so you can't pass invalid parameter combinations (e.g. complex_input=True, complex_output=True is possible with _fft_with_size but is meaningless).

The new operators can transform arbitrary dims and almost never make an extra tensor copy, because the backends have been reworked to make fewer assumptions about the format of the input. They also work directly with complex dtypes instead of going through view_as_real as a prior step.

I can't remove _fft_with_size yet because (as far as I know) torch.complex32 isn't fully working yet, so the real version is the only way to access cuFFT's half-precision FFTs.
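
As a quick illustration of the mapping at the Python API level (a sketch of which primitive each public call lowers to, not code from this diff):

```python
import torch

x = torch.randn(4, 5, 6)

# Real-to-complex over arbitrary dims: handled by _fft_r2c.
X = torch.fft.rfftn(x, dim=(0, 2))

# Complex-to-complex directly on a complex dtype, with no
# view_as_real round-trip: handled by _fft_c2c.
y = torch.fft.fftn(x.to(torch.complex64), dim=(0, 2))

# Complex-to-real inverse; s recovers the original length of the
# halved last transformed dim: handled by _fft_c2r.
x_back = torch.fft.irfftn(X, s=(4, 6), dim=(0, 2))
assert torch.allclose(x, x_back, atol=1e-5)
```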

@facebook-github-bot (Contributor)

Hi @peterbell10!

Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but we do not have a signature on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@mruberry (Collaborator) commented Nov 2, 2020

> I can't remove _fft_with_size yet because (as far as I know) torch.complex32 isn't fully working yet, so the real version is the only way to access cuFFT's half-precision FFTs.

Do we really support torch.complex32, though? If it's simpler not to, then I'd rather we not bother.

@@ -370,44 +300,36 @@ Tensor fft_ifftn(const Tensor& self, c10::optional<IntArrayRef> s,

 Tensor fft_rfftn(const Tensor& self, c10::optional<IntArrayRef> s,
                  c10::optional<IntArrayRef> dim,
-                 c10::optional<std::string> norm) {
+                 c10::optional<std::string> norm_str) {
   TORCH_CHECK(!self.is_complex(), "Expected a real input tensor to rfftn");
Collaborator:

"Expected a real input tensor to rfftn" -> "rfftn expects a real-valued input tensor, but got {dtype}!"

@peterbell10 marked this pull request as ready for review November 27, 2020 16:25
@peterbell10 (Collaborator Author)

I'm happy with this PR, so I've marked it ready for review. The benchmark PR review seemed to stall, but I think @robieta was okay with the benchmark itself and the discussion was more around where to put the files. So I think the results can be trusted even though it's not merged yet.

I have, however, marked #46912 as draft and moved it to the end of the stack. I wasn't seeing a consistent improvement, and in some cases saw a huge performance regression, so it might need to be reconsidered. The other two PRs in this stack should be good, though.

@mruberry (Collaborator)

> I'm happy with this PR, so I've marked it ready for review. The benchmark PR review seemed to stall, but I think @robieta was okay with the benchmark itself and the discussion was more around where to put the files. So I think the results can be trusted even though it's not merged yet.

Sounds good. What did you decide to do about complex half support and _fft_with_size?

> I have, however, marked #46912 as draft and moved it to the end of the stack. I wasn't seeing a consistent improvement, and in some cases saw a huge performance regression, so it might need to be reconsidered. The other two PRs in this stack should be good, though.

OK. Let's discuss those after this PR. Would it be helpful to take a break from this stack and refocus on tasks preparing torch.fft for 1.8? That is, removing the torch.fft function and importing the torch.fft module by default (and updating the documentation to reflect this change). That should be a simple change, but we should make it sooner rather than later to see if there are issues with the deprecation.

@peterbell10 (Collaborator Author)

> What did you decide to do about complex half support and _fft_with_size?

I don't think it's blocking for this PR. I have no idea if it's widely used, and it's certainly not needed for numpy/scipy compatibility. So, just dropping support might be okay. If not, then I think this code should already call cuFFT correctly for complex half, and the main issue for support is the other aten operators that get called during the FFT. For example, rfft(x) works if I remove the TORCH_CHECK(dtype == kFloat || dtype == kDouble, ...) but rfftn(x) fails with:

RuntimeError: "copy_" not implemented for 'ComplexHalf'
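
For context, that experiment reads roughly like this (a hypothetical sketch: it assumes a CUDA build with the float/double TORCH_CHECK relaxed, which stock builds don't have):

```python
import torch

x = torch.randn(4, 4, device="cuda", dtype=torch.half)

# 1-D path: reaches cuFFT's half-precision kernels directly once the
# dtype check is removed, so this succeeds.
torch.fft.rfft(x)

# n-D path: the post-processing calls aten ops that lack ComplexHalf
# kernels, so this raises:
# RuntimeError: "copy_" not implemented for 'ComplexHalf'
torch.fft.rfftn(x)
```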

> Would it be helpful to take a break from this stack and refocus on tasks preparing torch.fft for 1.8? That is, removing the torch.fft function and importing the torch.fft module by default (and updating the documentation to reflect this change). That should be a simple change, but we should make it sooner rather than later to see if there are issues with the deprecation.

Sure thing, I can start work on that tomorrow.

@mruberry (Collaborator)

> I don't think it's blocking for this PR. I have no idea if it's widely used, and it's certainly not needed for numpy/scipy compatibility. So, just dropping support might be okay.

Let's protect ourselves by disabling any complex half support we currently have and simplifying the code appropriately. If we decide to support complex half in the future, we should do a holistic review.

With complex half unsupported, would there be any changes you'd like to make in this PR still? Earlier you suggested that not supporting complex half would let this PR remove _fft_with_size?

@peterbell10 (Collaborator Author)

> With complex half unsupported, would there be any changes you'd like to make in this PR still? Earlier you suggested that not supporting complex half would let this PR remove _fft_with_size?

Ignoring half issues, _fft_with_size could be replaced by the new operators. However, if we're good to remove torch.{fft,rfft,ifft,irfft} now then it makes more sense to remove _fft_with_size in the same PR, so _fft_with_size wouldn't have any callers to rewrite.

@mruberry (Collaborator)

> Ignoring half issues, _fft_with_size could be replaced by the new operators. However, if we're good to remove torch.{fft,rfft,ifft,irfft} now then it makes more sense to remove _fft_with_size in the same PR, so _fft_with_size wouldn't have any callers to rewrite.

That makes sense. So you're saying let's keep this as-is, and then in the "1.8 readiness" PR described above we remove these functions and _fft_with_size? Sounds like a plan to me.

@mruberry (Collaborator) commented Dec 2, 2020

ASAN failure looks real:

Nov 27 18:57:01   test_fft_backward_cpu_float64 (__main__.TestFFTCPU) ... /var/lib/jenkins/workspace/aten/src/ATen/native/mkl/SpectralOps.cpp:88:19: runtime error: addition of unsigned offset to 0x619001e07ca0 overflowed to 0x619001e07c90

@peterbell10 (Collaborator Author)

Very weird ASAN error, since all the stride calculation there is done in signed arithmetic specifically to allow negative stride offsets. I've just added __ubsan_ignore_undefined__ and the false positive is silenced. @mruberry PTAL.

@@ -61,7 +73,7 @@ void _fft_fill_with_conjugate_symmetry_slice(
   // We explicitly loop over one row, then use this lambda to iterate over
   // n-dimensions. This advances iter_index by one row, while updating in_ptr
   // and out_ptr to point to the new row of data.
-  auto advance_index = [&] {
+  auto advance_index = [&] () __ubsan_ignore_undefined__ {
Collaborator:

This shouldn't exhibit undefined behavior. Let's investigate some more. Maybe start by typing the loop variable correctly as iter_index's size type. The loop is currently doing an unsigned vs signed comparison.

Collaborator Author:

> This shouldn't exhibit undefined behavior.

It doesn't; UBSAN just doesn't seem to like this line:

out_ptr += out_strides[i];

because out_strides can be negative. It seems to be treating the offset as if it were unsigned for some reason.

> Maybe start by typing the loop variable correctly as iter_index's size type.

DimVector::size_type is just size_t, which is already the loop variable's type.

Collaborator:

My mistake, I thought it was int64_t.

Collaborator Author:

Either this is a false positive, or the error message is just misleading. It specifically says unsigned offset, but IntArrayRef is int64_t, so out_strides is definitely a signed offset.

Collaborator:

Let's assume the error message is misleading, but there's a valid UBSAN issue.

Could this be complaining, for example, that

out_ptr += (signal_half_sizes[i] - 1) * out_strides[i];

is upsetting the sanitizer because it's applying a negative offset to out_ptr?

The message states that address 0x619001e07ca0 "overflows" to 0x619001e07c90. Let's focus on the last two digits, where the values differ: 0xa0 = 160 and 0x90 = 144, so the final address is 16 bytes lower. So maybe the sanitizer thinks there was an overflow because it doesn't expect a negative offset to be added (+=) to the pointer to arrive at that lower address?

Collaborator Author:

Yes, that's what I've been saying. The sanitizer doesn't seem to like that the calculated offset is negative. However, it's entirely expected to be negative because it's using negative strides. And as far as I'm aware, there's no undefined behaviour there.
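
For readers unfamiliar with negative strides, here is a standalone illustration of the access pattern under discussion (in NumPy, since PyTorch tensors don't expose negative strides at the Python level):

```python
import numpy as np

a = np.arange(8, dtype=np.float32)
rev = a[::-1]          # a reversed view over the same buffer
print(rev.strides)     # (-4,): each step moves the data pointer back 4 bytes

# At the C level, iterating over rev is exactly `ptr += negative_stride`;
# every address visited stays inside a's allocation, so nothing overflows.
# The pointer just walks backwards, in bounds.
print(rev[0], rev[7])  # prints: 7.0 0.0
```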

  }

  const auto value_type = c10::toValueType(input.scalar_type());
  out.resize_(batched_out_sizes, MemoryFormat::Contiguous);
Collaborator:

This is OK for now since the fft operations don't take out=, but in the future this will need to be updated to use resize_output().

Collaborator Author:

This won't actually match the expected output shape, though. I'm not sure the cuFFT backend could implement an out argument much better than just copying into it at the end.

Collaborator:

Interesting. Luckily a problem for tomorrow ;)

@mruberry self-requested a review December 8, 2020 08:32
@mruberry (Collaborator) left a review:

Thanks @peterbell10! Just one small nit on the formatting of a string. Let me know when this is ready to be merged.

@peterbell10 (Collaborator Author)

@mruberry the rfftn error message has been updated.

@facebook-github-bot (Contributor)

@mruberry merged this pull request in fc0a3a1.

@facebook-github-bot deleted the gh/peterbell10/22/head branch December 13, 2020 15:17
peterbell10 added a commit to peterbell10/pytorch that referenced this pull request Dec 15, 2020
ghstack-source-id: 6da7a2d097a4b5b8dc1b4dd709b945f62302cd8f
Pull Request resolved: pytorch#46911
@peterbell10 mentioned this pull request Dec 16, 2020