Chain rule will lead to NaN: sqrt() example #6394
Comments
sqrt((x-1)^2) = |x - 1| by definition. The derivative of |x - 1| at x = 1 does not exist, but both (x-1).abs() and (x-1).norm(2) give subgradient 0, while torch.sqrt((x - 1) ** 2) gives NaN.
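A minimal reproduction sketch, assuming a recent PyTorch (the variable names are illustrative):

```python
import torch

# At x = 1 the chain rule multiplies d sqrt(u)/du at u = 0 (which is inf)
# by d (x-1)^2 / dx at x = 1 (which is 0), and inf * 0 is NaN.
x = torch.tensor(1.0, requires_grad=True)
torch.sqrt((x - 1) ** 2).backward()
print(x.grad)  # tensor(nan)

# abs() has a special-cased subgradient of 0 at the kink, so no NaN appears.
x2 = torch.tensor(1.0, requires_grad=True)
(x2 - 1).abs().backward()
print(x2.grad)  # tensor(0.)
```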
The subgradient for norm at zero was added in #2775 and required some special casing. I don't think we can do much when the user specifies the set of operations by hand, because the derivative of sqrt at 0 is infinite. This is something we need to live with in numeric computing:

```python
print(np.log(0 ** 0))  # 0.0
print(0 * np.log(0))   # nan
```

Closing as a wontfix. Please let us know if you disagree.
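The failure mode can be seen at the floating-point level without any autograd at all. This sketch (NumPy, names illustrative) evaluates the two chain-rule factors separately, the way backpropagation does:

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    inner = 2.0 * (1.0 - 1.0)                       # d/dx (x-1)^2 at x = 1 -> 0.0
    outer = np.float64(1.0) / (2.0 * np.sqrt(0.0))  # d/du sqrt(u) at u = 0 -> inf
    grad = outer * inner                            # IEEE 754: inf * 0 -> nan

print(outer)  # inf
print(grad)   # nan
```

Autograd never sees the simplified function |x - 1|; it only sees the product of these per-op factors, so the NaN is unavoidable once sqrt is evaluated at 0.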
I found an interesting fact. For example, take y = sqrt((x-1)^2) and y = |x-1|: the gradient with respect to x should be the same in both cases, but autograd computes different gradients as x approaches 1. In fact, the gradient of x for y = x - 1 is just 1. The formula is reasonable in mathematics, but autograd computes it differently. I think the point is that when any sub-chain's gradient is NaN, the final gradient will be NaN.
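That propagation can be sketched with plain floats (the factors below are illustrative stand-ins for per-op gradients, not actual autograd internals): once one factor in the chain-rule product becomes NaN, every downstream product stays NaN.

```python
import math

# Hypothetical chain-rule factors along a backward pass:
# sqrt'(0) = inf, then d(x-1)^2/dx = 0, then some further upstream factor.
factors = [float("inf"), 0.0, 2.0]

grad = 1.0
for f in factors:
    grad *= f           # inf -> inf * 0 = nan -> nan * 2 = nan
print(grad)             # nan
print(math.isnan(grad)) # True
```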