Update abs diff rule to 0 at non-differentiable point #98
Tracker.jl breakage is unrelated. I believe this is correct. (But of course I do; I am an explicit proponent of this property.) I will merge this tomorrow unless someone raises good objections.
Re: Tracker.jl, I was hoping that was the case. Thanks.
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #98 +/- ##
=======================================
Coverage 97.86% 97.86%
=======================================
Files 3 3
Lines 187 187
=======================================
Hits 183 183
Misses 4 4
☔ View full report in Codecov by Sentry.
git blame shows it was added in #33 and there the explanation is arguably a bit clearer:
Some additional historical context: It seems the rule for
@devmotion Thanks for the extra context.
I think we should revert this. It breaks higher-order derivatives for some differentiable functions. E.g.
julia> ForwardDiff.hessian(t -> abs(t[1])^2, [0.0])
1×1 Matrix{Float64}:
 2.0

(TestDiffRules) pkg> add DiffRules@1.14
   Resolving package versions...
    Updating `~/TestDiffRules/Project.toml`
  [b552c78f] ↑ DiffRules v1.13.0 ⇒ v1.14.0
    Updating `~/TestDiffRules/Manifest.toml`
  [b552c78f] ↑ DiffRules v1.13.0 ⇒ v1.14.0

julia> ForwardDiff.hessian(t -> abs(t[1])^2, [0.0])
1×1 Matrix{Float64}:
 0.0
The example here is of course trivial but
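To see why the Hessian changes, here is a minimal sketch using a toy nested dual-number type. It is not ForwardDiff's implementation, and `Dual`, `abs_with`, `old_rule`, `new_rule`, and `second_derivative` are hypothetical names introduced only for illustration. It assumes the old rule behaves like "1 at zero" and the new rule like `sign(x)`, as described in this thread.

```julia
# Toy forward-mode dual number: value `v` plus derivative part `d`.
# Hypothetical type for illustration; not ForwardDiff.Dual.
struct Dual{T}
    v::T
    d::T
end

Base.:+(a::Dual, b::Dual) = Dual(a.v + b.v, a.d + b.d)
Base.:*(a::Dual, b::Dual) = Dual(a.v * b.v, a.v * b.d + a.d * b.v)

# Assumed old behavior: derivative of abs is 1 at x == 0.
old_rule(x::Real) = signbit(x) ? -one(x) : one(x)
# Assumed new behavior: derivative of abs is sign(x), hence 0 at x == 0.
new_rule(x::Real) = sign(x)

# Both rules are piecewise constant, so their own derivative part is zero.
old_rule(x::Dual) = Dual(old_rule(x.v), zero(x.d))
new_rule(x::Dual) = Dual(new_rule(x.v), zero(x.d))

# abs with a pluggable derivative rule, applied recursively to nested duals.
abs_with(rule, x::Real) = abs(x)
abs_with(rule, x::Dual) = Dual(abs_with(rule, x.v), rule(x.v) * x.d)

# Second derivative of t -> abs(t)^2 via two nested levels of duals.
function second_derivative(rule, t::Float64)
    x = Dual(Dual(t, 1.0), Dual(1.0, 0.0))
    y = abs_with(rule, x)
    return (y * y).d.d
end

println(second_derivative(old_rule, 0.0))
println(second_derivative(new_rule, 0.0))
```

With the `sign`-style rule both the value and the first derivative of `abs` vanish at 0, so the product rule for `abs(t) * abs(t)` has nothing left to contribute to the second derivative, even though `abs(t)^2 == t^2` is smooth.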
This PR updates the diffrule for abs to return 0 at the non-differentiable point. The current implementation returns 1. Although 1 is a valid subgradient there, this can prevent convergence in gradient descent. The implementation in this PR is the behavior the ChainRules.jl docs advise.
This also comes with the added benefit of not requiring the type, such as IntervalArithmetic.Interval, to support the ternary operator. This is the use case that led me to make this PR.
With this PR:
The diffrule for abs has the following comment, which I'm not sure how to interpret, as it doesn't work with IntervalArithmetic.Interval or Intervals.Interval. Additionally, the current definition assumes that 0 is not in the interval.
DiffRules.jl/src/rules.jl
Line 71 in 2001650
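The convergence concern in the description can be illustrated with a minimal sketch of fixed-step gradient descent on `abs`, started exactly at the minimizer. The names `deriv_one_at_zero`, `deriv_zero_at_zero`, and `descend`, as well as the step size and step count, are assumptions made for this example and are not taken from DiffRules.jl.

```julia
# Assumed "current" behavior: derivative of abs is 1 at x == 0.
deriv_one_at_zero(x) = signbit(x) ? -one(x) : one(x)

# Assumed behavior with this PR: derivative of abs is sign(x), so 0 at x == 0.
deriv_zero_at_zero(x) = sign(x)

# Plain gradient descent with a fixed learning rate (hypothetical helper).
function descend(deriv, x0; lr = 0.1, steps = 5)
    x = x0
    for _ in 1:steps
        x -= lr * deriv(x)
    end
    return x
end

# Starting at the minimizer x = 0:
println(descend(deriv_one_at_zero, 0.0))   # gets pushed away from 0
println(descend(deriv_zero_at_zero, 0.0))  # stays at 0
```

With a derivative of 1 at zero, the iterate is kicked off the minimum and then bounces around it at the scale of the learning rate; with a derivative of 0, the minimum is a fixed point of the update.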