Improve torch.fft n-dimensional transforms #46911
Conversation
[ghstack-poisoned]
💊 CI failures summary and remediations
As of commit e42964b (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚
This comment was automatically generated by Dr. CI. This comment has been revised 80 times.
This replaces use of

The new operators can transform arbitrary dims and almost never make an extra tensor copy, because the backends have been reworked to make fewer assumptions about the format of the input. They also work directly with complex dtypes instead of going through

I can't remove
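To make "transform arbitrary dims" concrete, here is a minimal sketch using numpy's equivalent API (numpy stands in for torch here; the behavior being described for the new torch.fft operators matches numpy's `fftn` semantics):

```python
import numpy as np

# Transform only dims 0 and 2 of a 3-d array; dim 1 is left untouched.
x = np.random.rand(4, 5, 6)
y = np.fft.fftn(x, axes=(0, 2))

# An n-dimensional FFT is separable: it equals 1-d FFTs applied
# along each requested axis in turn.
ref = np.fft.fft(np.fft.fft(x, axis=0), axis=2)
assert np.allclose(y, ref)
```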
Hi @peterbell10! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but we do not have a signature on file. In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
Do we really support torch.complex32, though? If it's simpler not to then I'd rather we not bother.
aten/src/ATen/native/SpectralOps.cpp (Outdated)
@@ -370,44 +300,36 @@ Tensor fft_ifftn(const Tensor& self, c10::optional<IntArrayRef> s,
 Tensor fft_rfftn(const Tensor& self, c10::optional<IntArrayRef> s,
                  c10::optional<IntArrayRef> dim,
-                 c10::optional<std::string> norm) {
+                 c10::optional<std::string> norm_str) {
   TORCH_CHECK(!self.is_complex(), "Expected a real input tensor to rfftn");
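For context on the renamed `norm_str` parameter: it selects the normalization mode, which behaves like numpy's `norm` argument. A minimal numpy sketch of the `"ortho"` scaling (numpy here is an analogue, not the torch code under review):

```python
import numpy as np

n = 16
x = np.random.rand(n)

# "ortho" scales the forward transform by 1/sqrt(n), making fft and
# ifft a unitary pair.
assert np.allclose(np.fft.fft(x, norm="ortho"), np.fft.fft(x) / np.sqrt(n))
```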
"Expected a real input tensor to rfftn" -> "rfftn expects a real-valued input tensor, but got {dtype}!"
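The suggested message includes the offending dtype. A hypothetical Python mirror of that check (the real code is a C++ `TORCH_CHECK`; `check_rfftn_input` is an illustrative name, not a real API):

```python
import numpy as np

def check_rfftn_input(a):
    # Hypothetical analogue of the TORCH_CHECK: reject complex input,
    # reporting the actual dtype as the review suggests.
    if np.iscomplexobj(a):
        raise TypeError(f"rfftn expects a real-valued input tensor, but got {a.dtype}!")

check_rfftn_input(np.zeros(4))  # real input passes silently
try:
    check_rfftn_input(np.zeros(4, dtype=np.complex64))
except TypeError as e:
    assert "complex64" in str(e)
```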
I'm happy with this PR, so I've marked it ready for review. The benchmark PR review seemed to stall, but I think @robieta was okay with the benchmark itself and the discussion was more around where to put the files. So, I think the results can be trusted even though it's not merged yet. I have, however, marked #46912 as draft and moved it to the end of the stack. I wasn't seeing a consistent improvement, and in some cases there was a huge regression in performance, so it might need to be reconsidered. The other two PRs in this stack should be good, though.
Sounds good. What did you decide to do about complex half support and _fft_with_size?
OK. Let's discuss those after this PR. Would it be helpful to take a break from this stack and refocus on tasks preparing torch.fft for 1.8? That is, removing the torch.fft function and importing the torch.fft module by default (and updating the documentation to reflect this change). That should be a simple change, but we should make it sooner rather than later to see if there are issues with the deprecation.
I don't think it's blocking for this PR. I have no idea if it's widely used and it's certainly not needed for numpy/scipy compatibility. So, just dropping support might be okay. If not, then I think this code should actually call cuFFT correctly for complex half already, and the main issue for support is the other aten operators that get called during the FFT. For example
Sure thing, I can start work on that tomorrow.
Let's protect ourselves by disabling any complex half support we currently have and simplifying the code appropriately. If we decide to support complex half in the future we should do a holistic review. With complex half unsupported, would there be any changes you'd like to make in this PR still? Earlier you suggested that not supporting complex half would let this PR remove
Ignoring half issues,
That makes sense. So you're saying let's keep this as-is, but then in the "1.8 readiness" PR described above we remove these functions and
ASAN failure looks real:
Very weird ASAN error, since all the stride calculation there is done in signed arithmetic specifically to allow negative stride offsets. Just added
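As a quick illustration of why the stride arithmetic must be signed, here is a numpy sketch (numpy views, not the PyTorch internals under discussion) showing that a reversed view is expressed with a negative byte stride:

```python
import numpy as np

a = np.arange(8, dtype=np.float64)
rev = a[::-1]                  # reversed view of the same buffer, no copy

# The stride is a signed byte offset: -8 bytes per step for float64.
assert rev.strides[0] == -8
assert rev[0] == a[-1]
```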
@@ -61,7 +73,7 @@ void _fft_fill_with_conjugate_symmetry_slice(
   // We explicitly loop over one row, then use this lambda to iterate over
   // n-dimensions. This advances iter_index by one row, while updating in_ptr
   // and out_ptr to point to the new row of data.
-  auto advance_index = [&] {
+  auto advance_index = [&] () __ubsan_ignore_undefined__ {
This shouldn't exhibit undefined behavior. Let's investigate some more. Maybe start by typing the loop variable correctly to iter_indexes' size type. The loop is currently doing an unsigned vs signed comparison.
> This shouldn't exhibit undefined behavior.

It doesn't; ubsan just doesn't seem to like this line:

out_ptr += out_strides[i];

because out_strides can be negative. It seems to be treating the pointer as if it were unsigned for some reason.

> Maybe start by typing the loop variable correctly to iter_indexes' size type.

DimVector::size_type is just size_t, which is already the type.
My mistake, I thought it was int64_t.
Either this is a false positive, or the error message is just misleading. It specifically says unsigned offset, but IntArrayRef is int64_t, so out_strides is definitely a signed offset.
Let's assume the error message is misleading, but there's a valid UBSAN issue. Could it be complaining, for example, that

out_ptr += (signal_half_sizes[i] - 1) * out_strides[i];

is upsetting the sanitizer because it's applying a negative offset to out_ptr? The message states that address 0x619001e07ca0 "overflows" to 0x619001e07c90. Let's focus on the last two digits, where these values differ: 0xa0 = 160 and 0x90 = 144. Maybe the sanitizer thinks there was overflow because it's not expecting a negative offset to be += to the pointer to arrive at a final offset of 144?
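A quick arithmetic check on the two addresses from the sanitizer report (the 8-byte element size assumed below is an illustration, not something the report states):

```python
# Addresses quoted in the UBSAN message.
start = 0x619001E07CA0
end = 0x619001E07C90

# The pointer moved backwards by exactly 16 bytes,
# i.e. two elements if each element is 8 bytes wide.
assert start - end == 16
assert (start - end) // 8 == 2
```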
Yes, that's what I've been saying. The sanitizer doesn't seem to like that the calculated offset is negative. However, it's entirely expected to be negative because it's using negative strides. And as far as I'm aware, there's no undefined behaviour there.
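To show what the conjugate-symmetry fill is doing (and where negative strides come from), here is a numpy sketch: the stored half-spectrum is walked backwards to synthesize the negative frequencies. This is an analogue of what `_fft_fill_with_conjugate_symmetry_slice` does in C++, not the code itself:

```python
import numpy as np

x = np.random.rand(8)
half = np.fft.rfft(x)   # positive frequencies only, length n//2 + 1
full = np.fft.fft(x)    # all n frequencies

# Hermitian symmetry of a real signal's spectrum: X[n-k] == conj(X[k]).
# Walking half[] backwards (a negative-stride traversal) fills the
# negative-frequency bins.
recon = np.concatenate([half, np.conj(half[-2:0:-1])])
assert np.allclose(recon, full)
```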
}

const auto value_type = c10::toValueType(input.scalar_type());
out.resize_(batched_out_sizes, MemoryFormat::Contiguous);
This is OK for now since the fft operations don't take out=, but in the future this will need to be updated to use resize_output().
This won't actually match the expected output shape, though. I'm not sure the cufft backend could implement an out argument much better than just copying into it at the end.
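The "just copy into it at the end" fallback can be sketched like this (a hypothetical helper in numpy, since numpy's fft has no `out=` either; `fft_into` is an illustrative name):

```python
import numpy as np

def fft_into(x, out):
    # Fallback strategy discussed above: compute into a temporary,
    # then copy the result into the caller-provided buffer.
    tmp = np.fft.fft(x)
    out[...] = tmp
    return out

x = np.random.rand(8)
buf = np.empty(8, dtype=np.complex128)
assert np.allclose(fft_into(x, buf), np.fft.fft(x))
```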
Interesting. Luckily a problem for tomorrow ;)
Thanks @peterbell10! Just one small nit on the formatting of a string. Let me know when this is ready to be merged.
@mruberry the
ghstack-source-id: 6da7a2d097a4b5b8dc1b4dd709b945f62302cd8f Pull Request resolved: pytorch#46911
Stack from ghstack:
Differential Revision: D25420647