-
Notifications
You must be signed in to change notification settings - Fork 21.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable complex dispatch on min/max functions #50347
Conversation
PROBLEM DESCRIPTION: GitHub issue 46391 suggests that compiler warnings pertaining to "floating-point value does not fit in required integral type" might cause some confusion. These compiler-warnings arise during compilation of the templated function uniform_int(). The warnings are misleading because they arise from the way the compiler compiles templated functions, but the if-else statements in the function obviate the possibilities that the warnings describe. So, the purpose of a fix would only be to fix the compiler warnings, and not to fix any sort of a bug. FIX DESCRIPTION: In the function uniform_int(), the if-else conditions pertaining to types double & float can be removed, and then uniform_int() can be overloaded for both double & float. Since template specialization for functions cannot be partial, we would have to add four overloaded versions of uniform_int(typename T, typename V) pertaining to type T being either double or float, and type V being either uint32_t or uint64_t. An unrelated observation is that the if-else condition pertaining to the type 'double' (line 57 in the original code) was redundant, as line 61 in the original code covered it (std::is_floating_point<T>::value would also have been true for the type 'double').
Based on @malfet's advice, used std::enable_if instead of using four full specializations
I had introduced an empty line by mistake. Removed it
As per malfet's advice, overloaded uniform_int() for floating-point types & non-floating-point types to prevent compiler warnings reported via GitHub issue 46391. Didn't change any code from the last commit in the forked repo, but only edited a comment, so as to re-run tests automatically on Circle CI, and check if any reported build failures on Circle CI were due to this commit, although it seems highly unlikely.
Update fork to latest code on master
PROBLEM: min/max functions were disabled for complex inputs (via dtype checks). However, min/max kernels are still compiled and dispatched for complex. FIX: The aforementioned dispatch should be disabled & we should rely on errors produced by dispatch macro to not run those ops on complex, instead of doing redundant dtype checks.
💊 CI failures summary and remediationsAs of commit 84e680c (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group. This comment has been revised 4 times. |
Codecov Report
@@ Coverage Diff @@
## master #50347 +/- ##
==========================================
+ Coverage 80.71% 80.73% +0.01%
==========================================
Files 1904 1904
Lines 206713 206687 -26
==========================================
+ Hits 166858 166861 +3
+ Misses 39855 39826 -29 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anjali411 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks for the prompt review, and for the opportunity to contribute! |
@@ -99,7 +99,7 @@ static void max_all_kernel_impl(Tensor& result, const Tensor& input) { | |||
reduce_all_impl<int64_t>(result, input, lower_bound<int64_t>(), | |||
[=](int64_t a, int64_t b) -> int64_t { return max_impl(a, b); }); | |||
} else { | |||
AT_DISPATCH_ALL_TYPES_AND_COMPLEX(input.scalar_type(), "max_all", [&] { | |||
AT_DISPATCH_ALL_TYPES(input.scalar_type(), "max_all", [&] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: a test would be nice, especially to check that (1) we are raising a sane error message in this case and (2) the behavior is consistent between CPU and CUDA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comment! I had later realized that I had forgotten to remove dtype checks & was going to submit another PR.
test_complex_unsupported()
exists to test condition 1.
As for checking whether the behavior is consistent between CPU & CUDA, please advise if error messages of the exception for both should be compared. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for checking whether the behavior is consistent between CPU & CUDA, please advise if error messages of the exception for both should be compared. Thanks!
The error messages don't necessarily need to be the same but they should both be readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your advice! test_complex_unsupported currently checks for RuntimeException. Could you please suggest how a test would check for readable error messages? For example, would you recommend checking if an error message has a length of, more than, say, 10? Or would you recommend checking if an error message contains a string such as not implemented or not support (not support can also be present in not supported)? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking if the error message contains some substring seems like a good way to ensure the error message is understandable This can be done with self.assertRaisesRegex
, e.g.
Lines 115 to 116 in 554a1a7
with self.assertRaisesRegex(RuntimeError, r"set_deterministic expects a bool, but got int"): | |
torch.set_deterministic(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking if the error message contains some substring seems like a good way to ensure the error message is understandable This can be done with
self.assertRaisesRegex
, e.g.Lines 115 to 116 in 554a1a7
with self.assertRaisesRegex(RuntimeError, r"set_deterministic expects a bool, but got int"): torch.set_deterministic(1)
Thanks again for your help!
Yes, I was also going to use self.assertRaisesRegex
, as I had earlier used it in test_maximum_minimum_complex()
in test_binary_ufuncs.py.
@anjali411 merged this pull request in 6420071. |
…ispatch for CPU min/max pointwise ops (#50465) Summary: Fixes #50064 **PROBLEM DESCRIPTION:** 1. Had not removed dtype checks for complex types in the previous PR (#50347) for this issue. These type-checks were added in #36377, but are no longer necessary, as we now rely upon dispatch macros to produce error messages. 2. dtype checks in `clamp_max()` and `clamp_min()` for complex inputs had not been removed either. 3. For min/max pointwise ops in TensorCompareKernel.cpp, complex dispatch had not been removed for min/max functions. ### **FIX DESCRIPTION:** **FIX SUMMARY:** 1. Removed dtype checks added in #36377, and added 3 more in TensorCompare.cpp. 2. Removed dtype checks for complex inputs in `clamp_max()` and `clamp_min()`. 3. Disabled complex dispatch for min/max pointwise ops in TensorCompareKernel.cpp. 4. Error messages in the exceptions raised due to min/max ops not being implemented are now checked for containing the text _not support_ (which can also be present in _not supported_), or _not implemented_, so one of them should be a part of error messages, in order for them to be informative. **REASON FOR NOT CHANGING DISPATCH FOR CUDA AND CLAMP OPS**: As for the CUDA min/max operations, their kernels do not seem to be compiled & dispatched for complex types anyway, so no further changes seem to be required. Basically, the dispatch macros currently being used don't have cases for complex types. For example, 1. the reduce CUDA ops use [AT_DISPATCH_ALL_TYPES_AND2 (https://github.com/pytorch/pytorch/commit/678fe9f0771a5cd98ead214363d70480ba03000d)](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L548-L575) in [ReduceMinMaxKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceMinMaxKernel.cu), and that macro doesn't allow complex types. 2. In [MinMaxElementwiseKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu), the CUDA pointwise ops use [`AT_DISPATCH_FLOATING_TYPES_AND2 (https://github.com/pytorch/pytorch/commit/678fe9f0771a5cd98ead214363d70480ba03000d)`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L240-L263) for non-integral & non-boolean types, and this marco doesn't have a case for complex types either. 3. [clamp CUDA ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/UnaryOpsKernel.cu#L170-L211) use `AT_DISPATCH_ALL_TYPES_AND2 (https://github.com/pytorch/pytorch/commit/678fe9f0771a5cd98ead214363d70480ba03000d)`, which doesn't have a case for complex types. Similarly, [CPU clamp min/max ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp#L428-L458) use the `AT_DISPATCH_ALL_TYPES_AND `dispatch macro, which doesn't have a case for complex types. **REASON FOR ADDING 3 dtype CHECKS:** There are a few cases in which the methods corresponding to `min_stub()` or `max_stub()` are not called, so dispatch macros don't get invoked, resulting in no exceptions being raised. Hence, `dtype` checks are necessary at 3 places to raise exceptions: 1. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L342 2. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L422 3. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L389 The first dtype check requirement can be verified from the following example Python code based on `test_complex_unsupported()`: ``` import unittest import torch class MyTestCase(unittest.TestCase): def test_1(self): t = torch.tensor((1 + 1j), device='cpu', dtype=torch.complex128) with self.assertRaises(Exception): torch.max(t, dim=0) if __name__ == '__main__': unittest.main() ``` Pull Request resolved: #50465 Reviewed By: mruberry Differential Revision: D25938106 Pulled By: ngimel fbshipit-source-id: 95e2df02ba8583fa3ce87d4a2fdcd60b912dda46
Fixes #50064
PROBLEM:
In issue #36377, min/max functions were disabled for complex inputs (via dtype checks).
However, min/max kernels are still being compiled and dispatched for complex.
FIX:
The aforementioned dispatch has been disabled & we now rely on errors produced
by dispatch macro to not run those ops on complex, instead of doing redundant dtype checks.