Chain rule will lead to NaN: sqrt() example #6394
Comments
sqrt((x-1)^2) = |x - 1| by definition. The derivative of |x - 1| at x = 1 does not exist, but both (x-1).abs() and (x-1).norm(2) give subgradient 0, while torch.sqrt((x - 1) ** 2) gives NaN.
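A minimal reproduction sketch, assuming a recent PyTorch (the variable names are illustrative):

```python
import torch

# At x = 1 the chain rule multiplies d sqrt(u)/du at u = 0 (which is inf)
# by d (x-1)^2 / dx at x = 1 (which is 0), and inf * 0 is NaN.
x = torch.tensor(1.0, requires_grad=True)
torch.sqrt((x - 1) ** 2).backward()
print(x.grad)  # tensor(nan)

# abs() has a special-cased subgradient of 0 at the kink, so no NaN appears.
x2 = torch.tensor(1.0, requires_grad=True)
(x2 - 1).abs().backward()
print(x2.grad)  # tensor(0.)
```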
The subgradient for norm at zero was added in #2775 and required some special casing. I don't think we can do much when the user specifies the set of operations by hand, because the derivative of sqrt at 0 is infinite. This is something we need to live with in numeric computing:

```python
print(np.log(0 ** 0))  # 0.0
print(0 * np.log(0))   # nan
```

Closing as a wontfix. Please let us know if you disagree.
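The failure mode can be seen at the floating-point level without any autograd at all. This sketch (NumPy, names illustrative) evaluates the two chain-rule factors separately, the way backpropagation does:

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    inner = 2.0 * (1.0 - 1.0)                       # d/dx (x-1)^2 at x = 1 -> 0.0
    outer = np.float64(1.0) / (2.0 * np.sqrt(0.0))  # d/du sqrt(u) at u = 0 -> inf
    grad = outer * inner                            # IEEE 754: inf * 0 -> nan

print(outer)  # inf
print(grad)   # nan
```

Autograd never sees the simplified function |x - 1|; it only sees the product of these per-op factors, so the NaN is unavoidable once sqrt is evaluated at 0.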
I found an interesting fact. For example, take y = sqrt((x-1)^2) and y = |x-1|: the gradient with respect to x should be the same in both cases, but autograd computes different gradients as x approaches 1. In fact, the gradient of x for y = x - 1 is just 1. The formula is reasonable in mathematics, but autograd computes it differently. I think the point is that when any sub-chain's gradient is NaN, the final gradient will be NaN.
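That propagation can be sketched with plain floats (the factors below are illustrative stand-ins for per-op gradients, not actual autograd internals): once one factor in the chain-rule product becomes NaN, every downstream product stays NaN.

```python
import math

# Hypothetical chain-rule factors along a backward pass:
# sqrt'(0) = inf, then d(x-1)^2/dx = 0, then some further upstream factor.
factors = [float("inf"), 0.0, 2.0]

grad = 1.0
for f in factors:
    grad *= f           # inf -> inf * 0 = nan -> nan * 2 = nan
print(grad)             # nan
print(math.isnan(grad)) # True
```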