
Disable complex dispatch on min/max functions #50347

Closed
wants to merge 6 commits

Conversation

@imaginary-person (Contributor) commented Jan 10, 2021

Fixes #50064

PROBLEM:
In issue #36377, min/max functions were disabled for complex inputs (via dtype checks).
However, min/max kernels are still being compiled and dispatched for complex types.

FIX:
The aforementioned dispatch has been disabled, and we now rely on the errors produced by the
dispatch macro to prevent these ops from running on complex inputs, instead of performing redundant dtype checks.
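For illustration, the user-visible behavior is unchanged: min/max on complex inputs still raise a RuntimeError; only where the error originates (an explicit dtype check vs. the dispatch macro) changes. A minimal sketch, assuming a PyTorch build with this change (the exact error text may vary):

```
import torch

# Complex inputs to min/max reductions should still be rejected; this PR
# only changes which layer produces the error.
t = torch.tensor([1 + 1j, 2 + 2j], dtype=torch.complex64)

for op in (torch.max, torch.min):
    try:
        op(t)
    except RuntimeError as e:
        print(f"{op.__name__}: {e}")
```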

PROBLEM DESCRIPTION:
GitHub issue 46391 suggests that compiler warnings pertaining to "floating-point value does not fit in required integral type" might cause some confusion.

These compiler warnings arise while compiling the templated function uniform_int(). They are misleading: they stem from how the compiler instantiates templated functions, while the if-else statements in the function already rule out the situations the warnings describe. So a fix would only silence the compiler warnings; it would not fix any actual bug.


FIX DESCRIPTION:
In the function uniform_int(), the if-else conditions pertaining to the types double & float can be removed,
and uniform_int() can then be overloaded for both double & float.
Since function templates cannot be partially specialized, we would have to add four overloads of uniform_int(typename T, typename V),
covering type T being either double or float, and type V being either uint32_t or uint64_t.

An unrelated observation is that the if-else condition for the type 'double' (line 57 in the original code) was redundant, as line 61 in the original code already covered it (std::is_floating_point<T>::value is also true for the type 'double').
Based on @malfet's advice, used std::enable_if instead of using four full specializations
I had introduced an empty line by mistake. Removed it
As per malfet's advice, overloaded uniform_int() for floating-point types & non-floating-point types to prevent compiler warnings reported via GitHub issue 46391.

Didn't change any code since the last commit in the forked repo; only edited a comment so that tests would re-run automatically on CircleCI, to check whether any build failures reported there were due to this commit, although that seems highly unlikely.
Update fork to latest code on master
PROBLEM:
min/max functions were disabled for complex inputs (via dtype checks). 
However, min/max kernels are still compiled and dispatched for complex.

FIX:
The aforementioned dispatch should be disabled, and we should rely on the errors
produced by the dispatch macro to prevent these ops from running on complex inputs, instead of performing redundant dtype checks.
@facebook-github-bot (Contributor) commented Jan 10, 2021

💊 CI failures summary and remediations

As of commit 84e680c (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@codecov (bot) commented Jan 10, 2021

Codecov Report

Merging #50347 (84e680c) into master (375c30a) will increase coverage by 0.01%.
The diff coverage is n/a.

```
@@            Coverage Diff             @@
##           master   #50347      +/-   ##
==========================================
+ Coverage   80.71%   80.73%   +0.01%
==========================================
  Files        1904     1904
  Lines      206713   206687      -26
==========================================
+ Hits       166858   166861       +3
+ Misses      39855    39826      -29
```

@zhangguanheng66 added the labels module: complex, module: internals, and triaged on Jan 11, 2021
@anjali411 (Contributor) left a comment:

LGTM thanks!

@facebook-github-bot (Contributor) left a comment:

@anjali411 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@imaginary-person (Contributor, Author) replied:

> LGTM thanks!

Thanks for the prompt review, and for the opportunity to contribute!

```
@@ -99,7 +99,7 @@ static void max_all_kernel_impl(Tensor& result, const Tensor& input) {
     reduce_all_impl<int64_t>(result, input, lower_bound<int64_t>(),
       [=](int64_t a, int64_t b) -> int64_t { return max_impl(a, b); });
   } else {
-    AT_DISPATCH_ALL_TYPES_AND_COMPLEX(input.scalar_type(), "max_all", [&] {
+    AT_DISPATCH_ALL_TYPES(input.scalar_type(), "max_all", [&] {
```
A reviewer (Contributor) commented:

nit: a test would be nice, especially to check that (1) we are raising a sane error message in this case and (2) the behavior is consistent between CPU and CUDA

@imaginary-person (Contributor, Author) replied:

Thanks for your comment! I had later realized that I had forgotten to remove dtype checks & was going to submit another PR.

test_complex_unsupported() exists to test condition 1.

As for checking whether the behavior is consistent between CPU & CUDA, please advise if error messages of the exception for both should be compared. Thanks!

A reviewer (Contributor) replied:

> As for checking whether the behavior is consistent between CPU & CUDA, please advise if error messages of the exception for both should be compared. Thanks!

The error messages don't necessarily need to be the same, but they should both be readable.

@imaginary-person (Contributor, Author) replied:

Thanks for your advice! test_complex_unsupported currently checks for RuntimeException. Could you please suggest how a test should check for readable error messages? For example, would you recommend checking that an error message is longer than, say, 10 characters? Or checking that it contains a string such as not implemented or not support (not support is also contained in not supported)? Thanks!

A reviewer (Contributor) replied:

Checking if the error message contains some substring seems like a good way to ensure the error message is understandable. This can be done with self.assertRaisesRegex, e.g.

pytorch/test/test_torch.py, lines 115 to 116 in 554a1a7:

```
with self.assertRaisesRegex(RuntimeError, r"set_deterministic expects a bool, but got int"):
    torch.set_deterministic(1)
```

@imaginary-person (Contributor, Author) replied:

> Checking if the error message contains some substring seems like a good way to ensure the error message is understandable. This can be done with self.assertRaisesRegex, e.g.
>
> pytorch/test/test_torch.py, lines 115 to 116 in 554a1a7:
>
> ```
> with self.assertRaisesRegex(RuntimeError, r"set_deterministic expects a bool, but got int"):
>     torch.set_deterministic(1)
> ```

Thanks again for your help!
Yes, I was also going to use self.assertRaisesRegex, as I had earlier used it in test_maximum_minimum_complex() in test_binary_ufuncs.py.
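Putting the suggestions above together, a sketch of what such a test might look like (the test name and message pattern are illustrative, not the exact test that was later added; it assumes both CPU and CUDA raise a RuntimeError mentioning that the op is unsupported or not implemented):

```
import unittest
import torch

class TestComplexMinMax(unittest.TestCase):

    def test_min_max_complex_raise_readable_error(self):
        # CPU is always checked; CUDA only if available.
        devices = ['cpu'] + (['cuda'] if torch.cuda.is_available() else [])
        for device in devices:
            t = torch.tensor((1 + 1j), device=device, dtype=torch.complex128)
            for op in (torch.max, torch.min):
                # Messages need not match across devices, but both should be
                # readable, e.g. mention "not support(ed)" or "not implemented".
                with self.assertRaisesRegex(RuntimeError, r"not support|not implemented"):
                    op(t)

if __name__ == '__main__':
    unittest.main()
```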

@facebook-github-bot (Contributor) commented:

@anjali411 merged this pull request in 6420071.

facebook-github-bot pushed a commit that referenced this pull request Jan 18, 2021
…ispatch for CPU min/max pointwise ops (#50465)

Summary:
Fixes #50064

**PROBLEM DESCRIPTION:**
1. Had not removed dtype checks for complex types in the previous PR (#50347) for this issue.
These type-checks were added in #36377, but are no longer necessary,
as we now rely upon dispatch macros to produce error messages.
2. dtype checks in `clamp_max()` and `clamp_min()` for complex inputs had not been removed either.
3. Complex dispatch had not been removed for the min/max pointwise ops in TensorCompareKernel.cpp.

### **FIX DESCRIPTION:**
**FIX SUMMARY:**
1. Removed dtype checks added in #36377, and added 3 more in TensorCompare.cpp.
2. Removed dtype checks for complex inputs in `clamp_max()` and `clamp_min()`.
3.  Disabled complex dispatch for min/max pointwise ops in TensorCompareKernel.cpp.
4. Error messages of the exceptions raised because min/max ops are not implemented for complex types are now checked for containing the text _not support_ (which is also present in _not supported_) or _not implemented_; one of these should appear in an error message for it to be informative.

**REASON FOR NOT CHANGING DISPATCH FOR CUDA AND CLAMP OPS**:

As for the CUDA min/max operations, their kernels do not seem to be compiled & dispatched for complex types anyway, so no further changes seem to be required. Basically, the dispatch macros currently being used don't have cases for complex types.

For example,

1. The reduce CUDA ops in [ReduceMinMaxKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/ReduceMinMaxKernel.cu) use [`AT_DISPATCH_ALL_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L548-L575), and that macro doesn't allow complex types.

2. In [MaxMinElementwiseKernel.cu](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/MaxMinElementwiseKernel.cu), the CUDA pointwise ops use [`AT_DISPATCH_FLOATING_TYPES_AND2`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Dispatch.h#L240-L263) for non-integral & non-boolean types, and this macro doesn't have a case for complex types either.

3. [clamp CUDA ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/UnaryOpsKernel.cu#L170-L211) use `AT_DISPATCH_ALL_TYPES_AND2`, which doesn't have a case for complex types.

Similarly, [CPU clamp min/max ops](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp#L428-L458) use the `AT_DISPATCH_ALL_TYPES_AND` dispatch macro, which doesn't have a case for complex types.
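As a quick sanity check of the clamp behavior described above, a small hedged sketch from Python (it only assumes that clamp, clamp_min, and clamp_max reject complex inputs with a RuntimeError; the exact messages may differ):

```
import torch

t = torch.tensor([1 + 1j], dtype=torch.complex64)

# clamp, clamp_min (min only), and clamp_max (max only) should all reject
# complex inputs; the dispatch macros they use have no complex cases.
for fn in (lambda x: torch.clamp(x, min=0.0, max=1.0),
           lambda x: torch.clamp(x, min=0.0),
           lambda x: torch.clamp(x, max=1.0)):
    try:
        fn(t)
    except RuntimeError as e:
        print(e)
```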

**REASON FOR ADDING 3 dtype CHECKS:**
There are a few cases in which the methods corresponding to `min_stub()` or `max_stub()` are not called, so dispatch macros don't get invoked, resulting in no exceptions being raised. Hence, `dtype` checks are necessary at 3 places to raise exceptions:

1. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L342
2. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L422
3. https://github.com/pytorch/pytorch/blob/52dcc7299925de055d330781d2fe0dad71182829/aten/src/ATen/native/TensorCompare.cpp#L389

The first dtype check requirement can be verified from the following example Python code based on `test_complex_unsupported()`:
```
import unittest
import torch

class MyTestCase(unittest.TestCase):

    def test_1(self):
        t = torch.tensor((1 + 1j), device='cpu', dtype=torch.complex128)
        # max with dim= does not reach the dispatch macro, so the explicit
        # dtype check in TensorCompare.cpp must raise here.
        with self.assertRaises(Exception):
            torch.max(t, dim=0)

if __name__ == '__main__':
    unittest.main()
```

Pull Request resolved: #50465

Reviewed By: mruberry

Differential Revision: D25938106

Pulled By: ngimel

fbshipit-source-id: 95e2df02ba8583fa3ce87d4a2fdcd60b912dda46
Labels: cla signed, Merged, module: complex, module: internals, open source, triaged
Development

Successfully merging this pull request may close these issues:

- Complex dispatch should be disabled on min/max functions
6 participants