
Conversation

ararslan
Member

This introduces a type-based approach that works slightly differently from the current API. Upon construction of an FDMethod subtype, the order of the method and the order of the derivative are checked for agreement, and the grid and coefficients are computed and stored. Instances of the types are callable, and calling one produces the value of the derivative. The bound is computed adaptively and recursively, which results in a better step size and thus a (hopefully) more robust estimate of the derivative.

"Why types?" you may be asking. It just seemed like an efficient way of expressing things and storing intermediate values.

"Where did this adaptive thing come from?" From none other than Wessel himself: https://github.com/wesselb/fdm.

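To make the construct-then-call pattern described above concrete, here is a minimal, self-contained sketch. ToyCentral, its fields, and the default step are illustrative assumptions, not the implementation in this PR:

using LinearAlgebra

# Hypothetical sketch, not the PR's actual code: construction validates the
# orders and precomputes the grid and coefficients; calling the instance
# evaluates the derivative.
struct ToyCentral
    p::Int                  # order of the method = number of grid points (odd here)
    q::Int                  # order of the derivative
    grid::Vector{Int}       # symmetric grid, e.g. -2:2 for p = 5
    coefs::Vector{Float64}  # finite-difference coefficients on that grid
end

function ToyCentral(p::Integer, q::Integer)
    isodd(p) && p > q || throw(ArgumentError("need odd p with p > q"))
    grid = collect(-(p ÷ 2):(p ÷ 2))
    # Solve the Vandermonde system: sum_j c_j * g_j^i = q! * δ_{i,q} for i = 0:p-1.
    V = [g^i for i in 0:p-1, g in grid]
    rhs = zeros(p); rhs[q + 1] = factorial(q)
    return ToyCentral(p, q, grid, V \ rhs)
end

# Calling the instance estimates the qth derivative of f at x with step h.
(m::ToyCentral)(f, x; h=cbrt(eps())) =
    sum(c * f(x + g * h) for (c, g) in zip(m.coefs, m.grid)) / h^m.q

ToyCentral(5, 1)(sin, 1.0)  # ≈ cos(1.0)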
@ararslan ararslan requested review from wesselb and willtebbutt May 16, 2019 23:14
@codecov

codecov bot commented May 16, 2019

Codecov Report

Merging #17 into master will increase coverage by 2.5%.
The diff coverage is 100%.


@@          Coverage Diff           @@
##           master   #17     +/-   ##
======================================
+ Coverage    92.5%   95%   +2.5%     
======================================
  Files           4     4             
  Lines          80   120     +40     
======================================
+ Hits           74   114     +40     
  Misses          6     6
Impacted Files Coverage Δ
src/methods.jl 100% <100%> (ø) ⬆️


@ararslan
Member Author

I've marked this as a draft because I'd like feedback in general, but especially on:

  • The implementation. Is this the best way to go about computing the bound?
  • The API. Should this replace the current API? How could it be improved? Currently it duplicates all existing logic and lives separately from the existing API.
  • Should we rename and reuse FDMReport for this? We could make it mutable and have it update the stored values on each adaptation step.
  • What are some corner cases that could be used to test that this provides improvements over the non-adaptive approach?

src/methods.jl Outdated

# Adaptively compute the bound on the function and derivative values, if applicable.
if 0 < adapt < p
newm = (M.name.wrapper)(p + 1, q)
Member

@wesselb wesselb May 17, 2019

I think bound should be set to an upper bound on the orderth (i.e. pth) derivative, so I believe this should be newm = (M.name.wrapper)(p + 1, p). Here p + 1 comes from the fact that the order must be strictly greater than the order of the derivative to estimate (which also means that we have a choice here). Furthermore, order in our FDM setup refers to the number of grid points, which is easily interpretable but also off by one from what "order" typically means. Perhaps we should adjust this.

src/methods.jl Outdated
if 0 < adapt < p
newm = (M.name.wrapper)(p + 1, q)
dfdx, = fdm(newm, f, x; eps=eps, bound=bound, adapt=(adapt - 1))
bound = maxabs(dfdx)
Member

@wesselb wesselb May 17, 2019

We have to be careful here to ensure that bound > 0. A simple solution is to add a tiny number, e.g. 1e-20. (Note that the case bound = 0 can easily occur, e.g. when adapting a p = 2 and q = 1 method for a linear function.)
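A hedged sketch of that suggestion in isolation (the TINY name matches the constant added later in this diff; safe_bound is an illustrative helper, not the PR's code):

const TINY = 1e-20

# Adding TINY keeps the bound strictly positive even when the estimated
# derivative is exactly zero (e.g. a linear function under a p = 2, q = 1
# adaptation step).
safe_bound(estimate::Real) = abs(estimate) + TINY

safe_bound(0.0)  # 1.0e-20 rather than 0.0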

@wesselb
Member

wesselb commented May 17, 2019

First of all, this looks great!!

The implementation. Is this the best way to go about computing the bound?

I have added some comments regarding the implementation.

It might also be worthwhile to implement the condition number mechanism to make the procedure even more robust. I think robustness of the estimation is an important feature.

The API. Should this replace the current API? How could it be improved? Currently it duplicates all existing logic and lives separately from the existing API.

I think the new API with the types is pretty sensible and could replace the current API. We might still want to support the old API by redirecting to the new API to maintain compatibility.

Should we rename and reuse FDMReport for this? We could make it mutable and have it update the stored values on each adaptation step.

Yeah, I'm not sure if having a separate type (i.e. FDMReport) just to store this information is the best way to go about it. Perhaps the new types (i.e. Forward etc) can instead store this information, which would be overwritten at every computation, thus always referring to the last computation. Any ideas on this?

What are some corner cases that could be used to test that this provides improvements over the non-adaptive approach?

A nice example is the estimation of the derivative of x -> 1 / x at an x near zero:

>>> central_fdm(5, 1, adapt=0)(lambda x: 1 / x, 1e-3) + 1e6
670283.9150891255

>>> central_fdm(5, 1, adapt=1)(lambda x: 1 / x, 1e-3) + 1e6
2.133892849087715e-05

>>> central_fdm(5, 1, adapt=2)(lambda x: 1 / x, 1e-3) + 1e6
5.097826942801476e-07

In general, I have not found much use of having more than two adaptation steps, and adapt=1 should probably be the default. For difficult functions, having a high number of adaptation steps may even prove harmful, because that requires estimating high-order derivatives, which can fail catastrophically for difficult functions.

Furthermore, the cases bound = 0 and eps = 0 should be tested. Having bound > 0 and eps > 0 is absolutely critical. For example,

>>> central_fdm(3, 2, adapt=0)(lambda x: x, 1.0)
0.0

gives 0.0 exactly, and this could introduce division by zero, if we're not careful.

@wesselb
Member

wesselb commented May 19, 2019

This looks really good to me now! Let's see what others have to say.

Another example where adaptation is useful is estimating the derivative of log near 0:

>>> forward_fdm(5, 1, adapt=0)(np.log, 1e-3)
996.0418094849194

>>> forward_fdm(5, 1, adapt=1)(np.log, 1e-3)
999.9999999445137

Member

@willtebbutt willtebbutt left a comment

I really like this. It's going to make our lives much more straightforward :)

I'm happy for this to be merged as is, but it would be good to implement some tests around the log example @wesselb gave, and maybe something involving inverting a slightly awkward matrix.

We should also test that things work as expected with adaptation near the edge of domains. For example, with the log example you really do have to use forward_fdm when working close to zero. You could check that backward_fdm works with x -> log(-x) for numbers just less than zero, etc.
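A hedged sketch of what such edge-of-domain tests might look like, using the constructor syntax shown elsewhere in this thread; the Backward name, the adapt keyword, and the tolerance are assumptions rather than the PR's actual test code:

using FDM, Test

# d/dx log(x) at x = 1e-3 is 1e3; forward differences stay inside the domain.
@test FDM.Forward(5, 1, adapt=1)(log, 1e-3) ≈ 1e3 rtol=1e-6

# d/dx log(-x) at x = -1e-3 is 1/x = -1e3; backward differences stay below zero.
# (Backward is assumed to mirror Forward.)
@test FDM.Backward(5, 1, adapt=1)(x -> log(-x), -1e-3) ≈ -1e3 rtol=1e-6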

A tiny number added to some quantities to ensure that division by 0 does not occur.
"""
const TINY = 1e-20
Member
Maybe we want to replace this constant by eps()? Or have I misunderstood what this is going to be used for?

Member Author

Yeah, we could do that. I was just following what's done in the Python implementation.

Member

The tiny number here should be absolutely small, so I'm not sure eps() makes sense. What we had before, eps(0.0), would be ideal if it were not at the edge of the floating-point range. 1e-20 is a compromise: it is absolutely small compared to the numbers we usually work with, whilst not being so small that it causes issues with the numerics.

Member

@wesselb wesselb May 20, 2019

Perhaps the above comment doesn't really make sense. TINY is supposed to be small compared to any number that we typically encounter, but eps() is only small relative to numbers on the order of magnitude of 1 (or bigger). If we consider 1e-6 to be on the lower end of the sizes of numbers we typically encounter, then eps(1e-6) ≈ 2e-22 should be sufficient, which is where I originally got the 1e-20 from. Does that make more sense? 😬
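For concreteness, these magnitudes can be checked in the REPL (Float64 values):

julia> eps(), eps(1e-6), eps(0.0)
(2.220446049250313e-16, 2.1175823681357508e-22, 5.0e-324)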

eps::Real
bound::Real
step::Real
accuracy::Real
Member

Do we care that the types of eps, bound, etc. are abstract? Is this likely to have performance implications?

Member Author

It does, but this is a mutable type anyway, so it will already not be terribly performant. The problem with having them concrete is that when the History object is constructed, we don't know the types of the values it will end up holding, since this is constructed before we call the FDMethod on a function. The function could return Float32, for example, and we wouldn't necessarily want to promote to Float64. (Though a case could be made to do so.)
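One possible direction, sketched purely as an illustration (the real History type and its construction may differ): parameterize the type so the fields become concrete once the value type is known.

# Hypothetical alternative, not the PR's code: a type parameter makes the fields
# concrete, at the cost of having to pick (or widen) the element type when the
# object is built.
mutable struct ToyHistory{T<:Real}
    eps::T
    bound::T
    step::T
    accuracy::T
end

# Constructed once the element type is known, e.g. from the function's output:
ToyHistory{Float32}(0f0, 0f0, 0f0, 0f0)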

@ararslan
Member Author

Another thing I haven't yet done here is have the current API simply call the new API; the code for both exists and is largely identical.

@ararslan
Member Author

For example,

>>> central_fdm(3, 2, adapt=0)(lambda x: x, 1.0)
0.0

gives 0.0 exactly, and this could introduce division by zero, if we're not careful.

julia> FDM.Central(3, 2, adapt=0)(identity, 1.0)
0.0

julia> central_fdm(3, 2)(identity, 1.0)
7.277897151650305e-7

Hmmmmm.

@wesselb
Member

wesselb commented May 21, 2019

Hmm, interesting. The Python implementation gives me a different result:

>>> central_fdm(3, 2, adapt=0)(lambda x: x, 1.0)
0.0

>>> central_fdm(3, 2, adapt=1)(lambda x: x, 1.0)
0.0

I'll have a closer look at the code.

@ararslan
Member Author

The behavior implemented here matches Python, but what's odd is that the original implementation does not.

@wesselb
Member

wesselb commented May 21, 2019

Oh, I totally missed that the second call uses the old API; didn't have my morning coffee yet...

I think it's due to the condition number:

julia> central_fdm(3, 2; M=1)(x -> x, 1.0)
7.277897151650305e-7

julia> central_fdm(3, 2; M=100)(x -> x, 1.0)
0.0

This change constitutes a near rewrite of method handling.
@ararslan
Member Author

Okay, assuming CI agrees, I think things should be in good shape now. Unfortunately, method handling was almost entirely rewritten. 😐 Also note that I've added a Nonstandard method, which represents a custom grid; otherwise we would lose that functionality with this change. However, we can't do adaptive bound computation on a custom grid.
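As a purely illustrative aside (the actual Nonstandard constructor may look different), the coefficients for an arbitrary user-supplied grid come from the same linear system as for the standard grids, which is why derivative evaluation still works even though adaptation does not:

using LinearAlgebra

# Hypothetical custom grid and derivative order; not the PR's Nonstandard type.
grid = [-0.5, 0.0, 1.5]
q = 1

# Same Vandermonde construction as for a standard grid.
V = [g^i for i in 0:length(grid)-1, g in grid]
rhs = zeros(length(grid)); rhs[q + 1] = factorial(q)
coefs = V \ rhs

h = 1e-4
sum(c * sin(0.0 + g * h) for (c, g) in zip(coefs, grid)) / h^q  # ≈ cos(0.0) = 1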

@ararslan ararslan marked this pull request as ready for review May 27, 2019 21:30
@ararslan ararslan changed the title WIP/RFC: Add adaptive function bound computation Add adaptive function bound computation May 27, 2019
@ararslan
Member Author

UGH. Adaptively computing the bounds actually results in more test failures in ChainRules. I was hoping it would make it more robust to changes in the random seed.

ararslan added 2 commits May 28, 2019 09:55
If it's too big, Bad Things can happen. See e.g. the added test.
@ararslan
Member Author

This no longer destroys Nabla's tests once very minor adjustments are made in Nabla to facilitate this change.

@wesselb
Member

wesselb commented May 29, 2019

This looks very good to me! Happy to merge, if everyone else is so as well.

@wesselb wesselb merged commit e40ed35 into master May 29, 2019
@ararslan ararslan deleted the aa/adapt branch May 29, 2019 18:41