
Conversation

@wesselb commented Sep 15, 2020

This PR simplifies the internals, which have become a little convoluted over time. It also adds extrapolate_fdm (#107).

Important changes:

  • There is only one type, FiniteDifferenceMethod, which implements an FDM on an arbitrary grid.
  • The function fdm is gone. All estimates are made by calling the method directly (see the sketch below).
  • Some keyword arguments have changed place.
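
For example, a minimal sketch of the new call style (the constructor and call follow central_fdm as used elsewhere in this thread; exact keyword names may differ):

    using FiniteDifferences

    # construct a fifth-order central method for the first derivative ...
    m = central_fdm(5, 1)

    # ... and estimate sin'(1.0) by calling the method object directly
    m(sin, 1.0)  # ≈ cos(1.0)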

TODO:

  • Update documentation.

wesselb and others added 2 commits September 16, 2020 17:00

@oxinabox left a comment


I haven't finished reviewing, but I like it so far.

Did we get rid of the history field? That was problematic, so I would be glad to be rid of it.

Can we do some tests on the number of allocations? Ideally, anything that doesn't allocate from a single call should not allocate from having the FDM called on it.

@wesselb commented Sep 16, 2020

> Did we get rid of the history field?

Yeah, I deleted it entirely. Perhaps a bit of a radical change, but I'm not sure it had a use anywhere.

> Ideally, anything that doesn't allocate from a single call should not allocate from having the FDM called on it.

Do you mean by this that everything the FDM does has to happen without any allocation?

@oxinabox

> Do you mean by this that everything the FDM does has to happen without any allocation?

Yeah, more or less. Once you have constructed the FiniteDifferenceMethod object, with its grid of coefficients etc., all that jvp, j′vp, grad, and jacobian are doing is invoking finite differencing in a loop. And all that that is doing is invoking the function in a loop and computing a linear combination of the return values, i.e. some scaled subtractions of scalars. Scaling and subtraction of scalars doesn't allocate.
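
To make that concrete, a rough sketch of the core estimate (a hypothetical helper, not the package's actual code): a grid of points, a vector of coefficients, and a step size, combined with nothing but scalar arithmetic:

    # hypothetical sketch: an order-p derivative estimate on grid points gᵢ
    # with coefficients cᵢ is sum(cᵢ * f(x + gᵢ*h)) / h^p, i.e. plain scalar
    # arithmetic around repeated function evaluations
    function estimate(f, x, grid, coefs, h, p)
        return sum(c * f(x + g * h) for (g, c) in zip(grid, coefs)) / h^p
    end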

wesselb and others added 2 commits September 16, 2020 18:12

@stevengj commented Sep 17, 2020

I updated Richardson.jl to add a new breaktol parameter to give more control over convergence. If you set breaktol=Inf, it makes the convergence much more robust to initial guesses that are too large (assuming that it eventually converges for small enough h, see below).

I added an example to the documentation showing how this fixes the problem you pointed out above with the derivative of 1/x for small x — you can now put in an arbitrarily large initial guess for that function and it will still converge (at the expense of more iterations, of course).

So, I would require Richardson.jl 1.2.0 and set breaktol=Inf. (At worst, it should shrink the stepsize h until your finite-difference approximation returns NaN from 0/0, but it will still terminate and return the best error obtained prior to that point. In most cases it will hit the rtol before that point, however.)
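
For reference, a small standalone sketch of the kind of call involved (assuming Richardson.jl's extrapolate; the function and evaluation point here are made up for illustration):

    using Richardson

    # extrapolate a central-difference estimate of d/dx (1/x) at x = 0.01,
    # starting from an initial step h = 0.1 that is far too large;
    # breaktol=Inf keeps shrinking h even if the error estimate first grows
    x = 0.01
    val, err = extrapolate(0.1, rtol=1e-10, breaktol=Inf) do h
        (inv(x + h) - inv(x - h)) / 2h
    end
    # val ≈ -1/x^2 = -10000.0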

@wesselb commented Sep 23, 2020

Nice! That sounds great; thanks for adding this. I'll default to breaktol = 10, which appears to work for a few of the trickier cases. It is probably a good idea to set up more elaborate and rigorous tests.

I am away for another week, but will then get back to this PR to wrap it up.

@stevengj

breaktol = Inf should be fine as well — the iteration will still halt due to the other stopping criteria.

@oxinabox commented Sep 30, 2020

@wesselb, how close are we getting to this being mergeable?

What deprecations do we need?

@oxinabox

Allocation performance is already better than it was before, benchmarking with const _fdm = central_fdm(5, 1; adapt=5).

Current release:

julia> @btime $_fdm(sin, 0.2);
  9.843 μs (254 allocations: 6.81 KiB)

This PR:

julia> @btime $_fdm(sin, 0.2);
  4.767 μs (146 allocations: 4.23 KiB)

@oxinabox commented Sep 30, 2020

I spent an hour or so digging into the remaining allocations beyond #109 (comment).

They are all in estimate_step. I think they boil down to the fact that adaptation causes a new finite difference method to be constructed on each call. I think the solution is that adaptive FiniteDifferenceMethods should, when first constructed, also construct the adapted versions of themselves. This could be done with a field that just stores the next adaptation (which itself would have a field that stores the next adaptation), as sketched below. Beyond the scope of this PR, maybe? If so, we can open an issue to do this as a follow-up.
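
A rough sketch of that idea (hypothetical type and field names, not the package's actual code):

    # hypothetical sketch: each adaptive method carries the next, lower-order
    # method it adapts with, built once at construction time instead of on
    # every call to estimate_step
    struct AdaptiveFDM{T}
        grid::Vector{Int}
        coefs::Vector{T}
        child::Union{AdaptiveFDM{T},Nothing}  # next adaptation, or nothing
    end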

@wesselb commented Oct 7, 2020

@oxinabox and @stevengj, I've addressed all your comments. How does this now look?

    promoted to a suitable floating point type.
    """
    add_tiny(x::T) where T<:AbstractFloat = x + convert(T, 1e-40)
    add_tiny(x::Integer) = add_tiny(float(x))

@stevengj

For this to work with arbitrary real types, including dimensionful types, you should probably use something like

    add_tiny(x::T) where T<:Number = x + oneunit(T) * oftype(float(real(one(T))), 1e-40)

However, philosophically I'm very suspicious of this — not only are you assuming an absolute scale for x, but also you are using 1e-40 independent of the precision.

Furthermore, in most cases this code will be equivalent to float(x), since the 1e-40 will be rounded off. What are you trying to accomplish here?

@wesselb Oct 7, 2020
Perhaps it's better to leave support for dimensionful types for a future PR. I'm not familiar with the Unitful package, so I'm not sure what other things would need to change. The typing should currently be restrictive enough to throw method errors if things are not converted to AbstractFloats, so there should be no infinite loops like there were before.

The purpose of add_tiny is to add a number that is small in the absolute sense. An example of where this is used is the step size estimation. The absolute roundoff error of x is computed with eps. If x = 0.0, then eps(x) is very small, and things can NaN out. To prevent that, we use add_tiny(eps(x)) instead, as illustrated below.
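
A minimal illustration of the failure mode being guarded against (the add_tiny behaviour follows the definition quoted above):

    julia> eps(0.0)            # roundoff scale at zero is a denormal ...
    5.0e-324

    julia> add_tiny(eps(0.0))  # ... which add_tiny bounds away from zero
    1.0e-40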

@oxinabox

Should we make some minimal change? If the above works, we can do that, and then later investigate Unitful support more broadly.

@wesselb Oct 7, 2020

For complex numbers, the above change would add a tiny real number to the real part only. I'm not sure what the behaviour for complex numbers should be: tiny would then most logically refer to the modulus, and there are many complex numbers with modulus 1e-40. I'm happy to go with the above change if you think this isn't important. Alternatively, we can restrict the scope of the function by changing its docstring to cover only AbstractFloats and Integers; the function is only used internally anyway.

EDIT: For now, I've restricted the docstring. Let me know if you want it otherwise.

@stevengj Mar 19, 2021

> The purpose of add_tiny is to add a number that is small in the absolute sense.

There is no such number — there is no such thing as “small in the absolute sense”. Smallness is always relative to something else. You are implicitly assuming some absolute scale here, which cannot possibly hold for arbitrary inputs.

For example, if the function I’m differentiating is log10(x) and I’m calculating the derivative at x=1e-50, then 1e-40 is not a small number in comparison to x.

The whole point of floating-point arithmetic is that it is scale-invariant (until you reach underflow/overflow). I suspect that you are breaking that here. The dimensionful incorrectness is a symptom of that.
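
A quick numerical illustration of that point (made-up values, using a plain forward difference rather than the package's method):

    # at x = 1e-50, a perturbation of 1e-40 is ten orders of magnitude larger
    # than x itself, so a forward difference of log10 with that step is garbage:
    x, h = 1e-50, 1e-40
    (log10(x + h) - log10(x)) / h   # ≈ 1e41, but the true derivative is
                                    # 1 / (x * log(10)) ≈ 4.34e49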

@oxinabox Mar 23, 2021

add_tiny has now been removed as of #125, and a bunch more rework has happened in #130.

    Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
    Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
    Richardson = "708f8203-808e-40c0-ba2d-98a6953ed40d"

@oxinabox Oct 7, 2020

Need to bump the version at the top of this file. Are we happy enough with the public API to call this v1? I think we are.

Or we might want to do a last release with deprecations, and then remove those as a follow-up, so that v1 has no deprecations.

@wesselb

I'm not sure what's best. For now, I've changed the version to 0.11.0. Let me know if you want to go for a major release instead.

@oxinabox left a comment

Basically LGTM. See the last few comments; address those and this should be good to go.

Do we want deprecations? What is this release going to be numbered as?

@wesselb merged commit 8292355 into master Oct 8, 2020