Implement copysign #46396
Conversation
[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Not available: byte, char, float/double complex

TODO:
- [ ] test
- [ ] doc
- [ ] kernel_vec
ghstack-source-id: ad138297abbe2e65bfb661c8446356e003ab152c Pull Request resolved: #46396
@@ -49,6 +49,11 @@ void copy_range(variable_list& out, IndexRange range, at::ArrayRef<Tensor> t) {
  std::copy(t.begin(), t.end(), out.begin() + range.first);
}

Tensor copysign_tensor_backward(Tensor grad, Tensor self, Tensor other) {
  auto result = grad * self.sign() * other.sign();
I am not sure that you need the `self.sign()` here.
It will fail gradcheck when you add it to the list in `common_method_invocation.py` if the formula is wrong. But I think you will need to change this.
For instance, with `a = tensor(-1.)` and `b = tensor(1.)`, we get `c = torch.copysign(a, b) = tensor(1.)`. The derivative with respect to `a` is `-1`, rather than `b.sign() = 1`. Any thoughts on that?
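The counterexample above can be checked numerically. A minimal sketch using NumPy (a stand-in for the PyTorch gradcheck harness) that compares the analytic gradient `sign(a) * sign(b)` against a central finite difference, away from `a = 0`:

```python
import numpy as np

def copysign_grad_a(a, b):
    # Analytic gradient of copysign(a, b) w.r.t. a, away from a == 0:
    # the output flips sign exactly when sign(a) != sign(b).
    return np.sign(a) * np.sign(b)

def numerical_grad_a(a, b, eps=1e-6):
    # Central finite difference of np.copysign w.r.t. its first argument.
    return (np.copysign(a + eps, b) - np.copysign(a - eps, b)) / (2 * eps)

a, b = -1.0, 1.0
print(copysign_grad_a(a, b))   # -1.0, matching the comment above
print(numerical_grad_a(a, b))  # -1.0
```

For `a = -1, b = 1` both agree on `-1`, which is what `b.sign()` alone would get wrong.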
Oh right, the derivative is -1 when the sign changes and 1 otherwise, so you need both! Agreed.
Also, there is a corner case at 0 here, where `sign()` returns 0. What is `copysign()` doing in that case? Is the backward formula good for this case as well?
From reading your table of outputs, is the following correct?
If it is the case, then it makes the gradient computation easier to derive for the special points. So basically all the "?" in your table above should be 0, as they all correspond to the case where `a = ±0`.
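The corner case at 0 can be probed directly with NumPy: `copysign` reads the sign *bit* of `b`, which `sign()` cannot express. A minimal sketch checking the `a = ±0` rows of the table:

```python
import numpy as np

# copysign takes the magnitude of `a` and the sign bit of `b`,
# so a == +/-0 always maps to a (signed) zero carrying b's sign bit.
for a in (-0.0, 0.0):
    for b in (-1.0, -0.0, 0.0, 1.0):
        c = np.copysign(a, b)
        assert c == 0.0                        # magnitude comes from a
        assert np.signbit(c) == np.signbit(b)  # sign bit copied from b
print("all a = +/-0 rows map to a zero with b's sign bit")
```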
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: byte, char, float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 1? |
| 0 | -1 | -0 | -1? |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | -1? |
| 0 | -0 | 0 | 1? |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | -1? |
| 0 | 0 | 0 | 1? |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | -1? |
| 0 | 1 | 0 | 1? |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [ ] test
- [ ] doc
- [x] ~kernel_vec~
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: byte, char, float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | 0 |
| 0 | -0 | 0 | 0 |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [ ] doc
- [x] ~kernel_vec~
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | 0 |
| 0 | -0 | 0 | 0 |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [x] doc
- [x] ~kernel_vec~
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | 0 |
| 0 | -0 | 0 | 0 |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [x] doc
- [x] ~kernel_vec~
- [ ] torch.copysign(Number input, Tensor other)
Sounds great.
Just test whether the second dtype is a float type or not, and only perform that part of the test if it is.
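The suggested gating can be sketched outside the PyTorch test suite with NumPy (the helper names and dtype pairs here are hypothetical; the point is only running the float-specific part when the second dtype is floating):

```python
import numpy as np

def is_float_dtype(dtype):
    # True for float16/float32/float64, False for integral/bool dtypes.
    return np.issubdtype(dtype, np.floating)

def run_test(a_dtype, b_dtype):
    # Hypothetical test body: the basic check runs for every dtype pair,
    # the float-specific part only when the second dtype is floating.
    checks = ["basic"]
    if is_float_dtype(b_dtype):
        checks.append("float-only")
    return checks

print(run_test(np.float32, np.int64))  # ['basic']
print(run_test(np.int64, np.float32))  # ['basic', 'float-only']
```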
[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | -1 | 1 |
| -0 | -0 | -0 | 0 |
| 0 | -0 | -0 | 0 |
| 1 | -0 | -1 | -1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test (cpu/gpu)
- [x] doc
- [x] ~kernel_vec~

Differential Revision: [D24401366](https://our.internmc.facebook.com/intern/diff/D24401366)
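The `a.grad` column above follows the convention of a zero gradient at `a = ±0`. A minimal NumPy sketch of that convention (the helper name `copysign_grad_a` is hypothetical):

```python
import numpy as np

def copysign_grad_a(a, b):
    # Gradient convention from the table: 0 at a == +/-0 (the
    # non-differentiable point), sign(a) * sign(b) elsewhere.
    return np.where(a == 0, 0.0, np.sign(a) * np.sign(b))

a = np.array([-1.0, 1.0, 0.0, -0.0])
b = np.array([-1.0, -1.0, 1.0, -1.0])
print(copysign_grad_a(a, b))  # [ 1. -1.  0.  0.]
```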
lgtm
Update because of the following two reasons: