Implement copysign #46396

Closed
wants to merge 38 commits into from

Conversation

@ejguan (Contributor) commented Oct 15, 2020

Related #38349
Stack from ghstack:

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)

- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:------:|
| -1 | -1 | -1 |    1   |
| -0 | -1 | -0 |    0   |
|  0 | -1 | -0 |    0   |
|  1 | -1 | -1 |   -1   |
| -1 | -0 | -1 |    1   |
| -0 | -0 | -0 |    0   |
|  0 | -0 | -0 |    0   |
|  1 | -0 | -1 |   -1   |
| -1 |  0 |  1 |   -1   |
| -0 |  0 |  0 |    0   |
|  0 |  0 |  0 |    0   |
|  1 |  0 |  1 |    1   |
| -1 |  1 |  1 |   -1   |
| -0 |  1 |  0 |    0   |
|  0 |  1 |  0 |    0   |
|  1 |  1 |  1 |    1   |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.
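
For reference, a minimal NumPy sketch (not part of this PR's code) reproducing the `c` column of the table; note that `np.copysign` keys off the sign bit of `b`, so `b = -0` behaves like a negative `b`:

```python
import numpy as np

a = np.array([-1.0, -0.0, 0.0, 1.0])
for b in (-1.0, -0.0, 0.0, 1.0):
    # np.copysign copies the sign *bit* of b onto the magnitude of a,
    # so b = -0.0 produces negative outputs just like b = -1.0.
    print(f"b = {b:+.1f} -> {np.copysign(a, b)}")
```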

TODO:

- [x] test (cpu/gpu)
- [x] doc
- [x] ~kernel_vec~

Differential Revision: [D24401366](https://our.internmc.facebook.com/intern/diff/D24401366)

[ghstack-poisoned]
@ejguan ejguan marked this pull request as draft October 15, 2020 15:20
@facebook-github-bot (Contributor) commented Oct 15, 2020

💊 CI failures summary and remediations

As of commit 310bc5e (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI.

@ejguan changed the title from "Implement copysign" to "[WIP] Implement copysign" on Oct 15, 2020
@dr-ci dr-ci bot commented Oct 15, 2020

💊 CI failures summary and remediations

As of commit e3e5eed (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

2 failures confirmed as flaky and can be ignored:

- pytorch_libtorch_linux_xenial_cuda11_0_cudnn8_py3_gcc7_build
- pytorch_linux_xenial_cuda9_2_cudnn7_py3_gcc5_4_build

This comment was automatically generated by Dr. CI.

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Not available: byte, char, float/double complex

TODO:
- [ ] test
- [ ] doc
- [ ] kernel_vec

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Oct 15, 2020
ghstack-source-id: ad138297abbe2e65bfb661c8446356e003ab152c
Pull Request resolved: #46396
```cpp
@@ -49,6 +49,11 @@ void copy_range(variable_list& out, IndexRange range, at::ArrayRef<Tensor> t) {
  std::copy(t.begin(), t.end(), out.begin() + range.first);
}

Tensor copysign_tensor_backward(Tensor grad, Tensor self, Tensor other) {
  auto result = grad * self.sign() * other.sign();
```
Collaborator

I am not sure that you need the `self.sign()` here.
It will fail gradcheck when you add it to the list in `common_methods_invocations.py` if the formula is wrong. But I think you will need to change this.
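
For example, a minimal gradcheck sketch along those lines (assuming the `torch.copysign` added by this PR; inputs kept away from `a = 0`, where the function is non-differentiable):

```python
import torch
from torch.autograd import gradcheck

# Double-precision inputs with |a| bounded away from 0 so the finite-difference
# check never straddles the non-differentiable point a = 0.
a = torch.randn(10, dtype=torch.double)
a = ((a.abs() + 0.1) * a.sign()).requires_grad_()
b = torch.randn(10, dtype=torch.double)

# gradcheck compares the analytical backward formula against numerical gradients.
assert gradcheck(torch.copysign, (a, b))
```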

Contributor Author


For instance:

- `a = tensor(-1.)`
- `b = tensor(1.)`
- `c = torch.copysign(a, b) = tensor(1.)`

The derivative with respect to `a` is -1 rather than `b.sign() = 1`. Any thoughts on that?
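
A quick NumPy finite-difference check of this example (a sketch, not part of the PR):

```python
import numpy as np

# At a = -1, b = 1, copysign flips the sign of a, so d copysign(a, b) / da
# should be -1 (i.e. sign(a) * sign(b)), not b.sign() = 1.
a, b, eps = -1.0, 1.0, 1e-6
fd = (np.copysign(a + eps, b) - np.copysign(a - eps, b)) / (2 * eps)
print(fd)  # ~ -1.0
```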

Collaborator


Ah right, the derivative is -1 when the sign flips and 1 otherwise, so you need both! Agreed.

Also, there is a corner case at 0 here, where `sign()` returns 0. What does `copysign()` do in that case? Is the backward formula good for that case as well?

@albanD (Collaborator) commented Oct 15, 2020

From reading your table of outputs, is the following correct?

```
copysign(a, b) = {
     abs(a) if b >= 0
    -abs(a) if b <  0
}
```

If so, it makes the gradient easier to derive at the special points.
In particular, for b >= 0 and a = 0, the function is convex, so we select the min-norm subgradient: 0.
And for b < 0 and a = 0, the function is concave, so we select the min-norm supergradient: 0.

So basically all the "?" entries in your table above should be 0, since they all correspond to the case a = 0.
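
A small NumPy check of that closed form (a sketch, not part of the PR). One wrinkle: `np.copysign` branches on the sign bit of `b`, so `b = -0` falls on the negative branch; writing the condition with `signbit` matches it exactly, including the sign of zero outputs:

```python
import numpy as np

def copysign_ref(a, b):
    # abs(a) when the sign bit of b is clear, -abs(a) when it is set;
    # using signbit (rather than b >= 0) makes b = -0.0 take the negative branch.
    return np.abs(a) if not np.signbit(b) else -np.abs(a)

for a in (-1.0, -0.0, 0.0, 1.0):
    for b in (-1.0, -0.0, 0.0, 1.0):
        out, ref = np.copysign(a, b), copysign_ref(a, b)
        assert out == ref and np.signbit(out) == np.signbit(ref)
```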

Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: byte, char, float/double complex

`c = np.copysign(a, b)`
|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 |   1  |
| -0 | -1 | -0 |   1?  |
|  0 | -1 | -0 |  -1?  |
|  1 | -1 | -1 |  -1  |
| -1 | -0 |  1 |  -1  |
| -0 | -0 |  0 |  -1?  |
|  0 | -0 |  0 |   1?  |
|  1 | -0 |  1 |   1  |
| -1 |  0 |  1 |  -1  |
| -0 |  0 |  0 |  -1?  |
|  0 |  0 |  0 |   1?  |
|  1 |  0 |  1 |   1  |
| -1 |  1 |  1 |  -1  |
| -0 |  1 |  0 |  -1?  |
|  0 |  1 |  0 |   1?  |
|  1 |  1 |  1 |   1  |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [ ] test
- [ ] doc
- [x] ~kernel_vec~

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Oct 15, 2020
ghstack-source-id: 0378575506b0ec7464690dbb1b3ca4f7e8569224
Pull Request resolved: #46396
ejguan added a commit that referenced this pull request Oct 16, 2020
ghstack-source-id: 2937708298c613db16a5b4fc2dfbd9cd45423fec
Pull Request resolved: #46396
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: byte, char, float/double complex

`c = np.copysign(a, b)`
|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 |   1  |
| -0 | -1 | -0 |   0  |
|  0 | -1 | -0 |  0  |
|  1 | -1 | -1 |  -1  |
| -1 | -0 |  1 |  -1  |
| -0 | -0 |  0 |  0  |
|  0 | -0 |  0 |   0  |
|  1 | -0 |  1 |   1  |
| -1 |  0 |  1 |  -1  |
| -0 |  0 |  0 |  0  |
|  0 |  0 |  0 |   0  |
|  1 |  0 |  1 |   1  |
| -1 |  1 |  1 |  -1  |
| -0 |  1 |  0 |  0  |
|  0 |  1 |  0 |   0  |
|  1 |  1 |  1 |   1  |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [ ] doc
- [x] ~kernel_vec~

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Oct 16, 2020
ghstack-source-id: 5f477aa5d724aeaa1606d1aa690cf4778f433cec
Pull Request resolved: #46396
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`
|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 |   1  |
| -0 | -1 | -0 |   0  |
|  0 | -1 | -0 |  0  |
|  1 | -1 | -1 |  -1  |
| -1 | -0 |  1 |  -1  |
| -0 | -0 |  0 |  0  |
|  0 | -0 |  0 |   0  |
|  1 | -0 |  1 |   1  |
| -1 |  0 |  1 |  -1  |
| -0 |  0 |  0 |  0  |
|  0 |  0 |  0 |   0  |
|  1 |  0 |  1 |   1  |
| -1 |  1 |  1 |  -1  |
| -0 |  1 |  0 |  0  |
|  0 |  1 |  0 |   0  |
|  1 |  1 |  1 |   1  |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [x] doc
- [x] ~kernel_vec~

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Oct 16, 2020
ghstack-source-id: edccfde9095eab988ba254c001b784021fa6fe87
Pull Request resolved: #46396
ejguan added a commit that referenced this pull request Oct 19, 2020
ghstack-source-id: 0d5fc918521ebe7ac4e537eb9b13a2e3c18c2091
Pull Request resolved: #46396
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`
|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 |   1  |
| -0 | -1 | -0 |   0  |
|  0 | -1 | -0 |  0  |
|  1 | -1 | -1 |  -1  |
| -1 | -0 |  1 |  -1  |
| -0 | -0 |  0 |  0  |
|  0 | -0 |  0 |   0  |
|  1 | -0 |  1 |   1  |
| -1 |  0 |  1 |  -1  |
| -0 |  0 |  0 |  0  |
|  0 |  0 |  0 |   0  |
|  1 |  0 |  1 |   1  |
| -1 |  1 |  1 |  -1  |
| -0 |  1 |  0 |  0  |
|  0 |  1 |  0 |   0  |
|  1 |  1 |  1 |   1  |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [x] doc
- [x] ~kernel_vec~
- [ ] torch.copysign(Number input, Tensor other)

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Oct 29, 2020
ghstack-source-id: c1a5bac31d691283301d70fa185e689f923b090a
Pull Request resolved: #46396
@mruberry (Collaborator) left a comment

Nice work, @ejguan. I just made a few comments on the test_torch.py test. This is looking really good. @zou3519, you'll shepherd this the rest of the way, right?

@mruberry (Collaborator) commented Oct 30, 2020

> I removed tests with explicit expected outputs, because I think it's enough to compare the results between PyTorch and NumPy with random cases and special cases (0.0/-0.0/inf/-inf/nan).

Sounds great.

> And, I did not use a cross product of dtypes because I need to enumerate all types (float/integer/boolean) to check type promotion and the result for both arguments of copysign. But all the special cases (0.0/-0.0/inf/-inf/nan) only require floating-point input as the second argument of copysign.

Just test whether the second dtype is a float type and only perform that part of the test if it is.
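
For illustration, a rough sketch of a test with that shape (hypothetical helper name, not the code in this PR; assumes the `torch.copysign` added here):

```python
import numpy as np
import torch

def compare_with_numpy(a, b):
    # Reference comparison against NumPy; equal_nan so nan outputs still match.
    expected = np.copysign(a.numpy(), b.numpy())
    actual = torch.copysign(a, b).numpy()
    np.testing.assert_allclose(actual, expected, equal_nan=True)

a = torch.randn(100)
b = torch.randn(100)
compare_with_numpy(a, b)

# Special values (0.0/-0.0/inf/-inf/nan) only make sense as the second argument
# when its dtype is floating point, so gate that part of the test on the dtype.
if b.dtype.is_floating_point:
    specials = torch.tensor([0.0, -0.0, float("inf"), float("-inf"), float("nan")])
    compare_with_numpy(torch.randn(5), specials)
```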

@ejguan (Contributor, Author) commented Oct 30, 2020

> Nice work, @ejguan. I just made a few comments on the test_torch.py test. This is looking really good. @zou3519, you'll shepherd this the rest of the way, right?

Thank you so much, Mike, for reviewing the code and giving suggestions. @mruberry


[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) 
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`
|  a |  b |  c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 |   1  |
| -0 | -1 | -0 |   0  |
|  0 | -1 | -0 |  0  |
|  1 | -1 | -1 |  -1  |
| -1 | -0 |  -1 |  1  |
| -0 | -0 | -0 |  0  |
|  0 | -0 | -0 |   0  |
|  1 | -0 |  -1 |   -1  |
| -1 |  0 |  1 |  -1  |
| -0 |  0 |  0 |  0  |
|  0 |  0 |  0 |   0  |
|  1 |  0 |  1 |   1  |
| -1 |  1 |  1 |  -1  |
| -0 |  1 |  0 |  0  |
|  0 |  1 |  0 |   0  |
|  1 |  1 |  1 |   1  |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test (cpu/gpu)
- [x] doc
- [x] ~kernel_vec~

Differential Revision: [D24401366](https://our.internmc.facebook.com/intern/diff/D24401366)

[ghstack-poisoned]
ejguan added a commit that referenced this pull request Nov 2, 2020
ghstack-source-id: 39d7768cbfba6c9c4eb5a090d0182b2423d9f990
Pull Request resolved: #46396
@zou3519 (Contributor) left a comment

lgtm


ejguan added a commit that referenced this pull request Nov 2, 2020
ghstack-source-id: caeb4463c61473183ac336f7d6eecb4fd05483c0
Pull Request resolved: #46396

ejguan added a commit that referenced this pull request Nov 3, 2020
ghstack-source-id: fdf7fbb840449a34cf23b09735eba05a1235fb19
Pull Request resolved: #46396
@ejguan (Contributor, Author) commented Nov 3, 2020

Updated for the following two reasons:

1. `torch.bfloat16` cannot hold `-nan`
2. `torch.half` cannot hold `-nan` on CUDA
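
A small probe for these two points (a sketch, not the PR's test code; `keeps_negative_nan` is a hypothetical helper, and behavior may differ by device/backend):

```python
import torch

def keeps_negative_nan(dtype, device="cpu"):
    # Build -nan in float32 (negation flips the sign bit on IEEE-754 hardware),
    # cast it to `dtype`, and check whether the sign bit survives the round trip.
    neg_nan = -torch.tensor(float("nan"), device=device)
    return bool(torch.signbit(neg_nan.to(dtype)).item())

for dt in (torch.float32, torch.float16, torch.bfloat16):
    print(dt, keeps_negative_nan(dt))
```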

@facebook-github-bot (Contributor)

@ejguan merged this pull request in f1ac63d.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants