
Basic finite differences #666

Merged: 41 commits merged into develop from fixes_ys on May 17, 2021
Conversation

@yannikschaelte (Member) commented May 13, 2021

Implementation of central finite differences as an objective class. It can be wrapped around existing objectives to provide finite difference approximations to the derivatives (grad, hess, sres); a usage sketch follows below the list.

Not implemented yet:

  • choice of method (forward, backward; currently only centered),
  • step length adaptation,
  • efficient handling of fixed parameters,
  • one could save a few evaluations here and there, but as those are only O(n_par), I currently don't care ...

closes #18
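For context, a minimal usage sketch of the wrapper. This assumes the class is exported as `FD` from `pypesto.objective` (the name used in the code under review); treat the exact API as an assumption, not a given:

```python
import numpy as np
import pypesto
from pypesto.objective import FD  # assumed export; see finite_difference.py

# An objective that only provides function values, no analytic derivatives.
objective = pypesto.Objective(fun=lambda x: np.sum(x ** 4))

# Wrap it so gradient (and Hessian) requests are answered via central FDs.
fd_objective = FD(objective)

x = np.array([1.0, 2.0])
print(fd_objective.get_fval(x))  # exact: 17.0
print(fd_objective.get_grad(x))  # FD approximation of [4.0, 32.0]
```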

@yannikschaelte (Member, Author) commented May 13, 2021

waiting for #663

@codecov-commenter commented May 13, 2021

Codecov Report

Merging #666 (d7fb77e) into develop (1b63597) will decrease coverage by 52.87%.
The diff coverage is 23.17%.


@@             Coverage Diff              @@
##           develop     #666       +/-   ##
============================================
- Coverage    87.71%   34.83%   -52.88%     
============================================
  Files           95       96        +1     
  Lines         6136     6350      +214     
============================================
- Hits          5382     2212     -3170     
- Misses         754     4138     +3384     
| Impacted Files | Coverage Δ |
| --- | --- |
| pypesto/__init__.py | 100.00% <ø> (ø) |
| pypesto/objective/history.py | 47.68% <ø> (-47.69%) ⬇️ |
| pypesto/optimize/__init__.py | 100.00% <ø> (ø) |
| pypesto/objective/finite_difference.py | 15.00% <15.00%> (ø) |
| pypesto/objective/base.py | 53.14% <85.71%> (-34.58%) ⬇️ |
| pypesto/objective/__init__.py | 100.00% <100.00%> (ø) |
| pypesto/objective/aggregated.py | 25.49% <100.00%> (-74.51%) ⬇️ |
| pypesto/objective/function.py | 82.65% <100.00%> (-5.31%) ⬇️ |
| pypesto/optimize/optimizer.py | 82.71% <100.00%> (-7.40%) ⬇️ |
| pypesto/objective/aesara.py | 0.00% <0.00%> (-92.93%) ⬇️ |
| ... and 70 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1b63597...d7fb77e.

Resolved review threads on: pypesto/objective/aggregated.py (outdated), pypesto/optimize/optimizer.py (outdated), pypesto/objective/function.py
yannikschaelte and others added 2 commits May 14, 2021 23:00
Co-authored-by: Paul Jonas Jost <70631928+PaulJonasJost@users.noreply.github.com>
Co-authored-by: Paul Jonas Jost <70631928+PaulJonasJost@users.noreply.github.com>
@jvanhoefer (Member) commented:
I currently don't see an example, might be nice to have something like that? :)

@yannikschaelte (Member, Author) commented:
> I currently don't see an example, might be nice to have something like that? :)

One could make one, but I think that would be rather artificial at the moment, and it would be refactored anyway once we decide how to structure the pyPESTO tutorial; therefore I would rather wait until then with an example?

@paulstapor (Contributor) left a comment

Approved, looks good

Resolved review thread on pypesto/objective/aggregated.py
```python
return request.param


def test_fds(fd_method):
```
Contributor:

Nice test, does exactly what it should!

Resolved (outdated) review thread on pypesto/objective/finite_difference.py
Comment on lines 40 to 46:

```python
delta_fun:
    FD step sizes for gradient and Hessian.
    Can be either a float, or a :class:`np.ndarray` of shape (n_par,)
    for different step sizes for different coordinates.
delta_res:
    FD step sizes for residual sensitivities.
    Similar to `delta_fun`.
```
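In other words, both accepted forms, as a hedged illustration (constructor keyword as documented above, `FD` export assumed):

```python
import numpy as np
import pypesto
from pypesto.objective import FD  # assumed export

objective = pypesto.Objective(fun=lambda x: np.sum(x ** 2))

fd_scalar = FD(objective, delta_fun=1e-6)                    # one step size for all coordinates
fd_vector = FD(objective, delta_fun=np.array([1e-6, 1e-4]))  # per-coordinate steps, shape (n_par,)
```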
Contributor:

Why do we have a distinct step size for sres, but not for the Hessian? If we already open the box of allowing different step sizes for different derivatives, then we should also provide one for "Hessian via gradients"... right?

@yannikschaelte (author):

agreed, adding one

```python
    FD step sizes for residual sensitivities.
    Similar to `delta_fun`.
method:
    Method to calculate FDs. Currently, only "center" is supported.
```
Contributor:
Overall for good reasons... I would even omit this option entirely. Central FDs are typically (and from theory) much more accurate... I don't see a good reason for only allowing forward or backward ones...

Further resolved review threads on pypesto/objective/finite_difference.py (three outdated, one current).
Comment on lines +375 to +376:

```python
hess[ix1, ix2] = hess[ix2, ix1] = \
    (fpp - fpm - fmp + fmm) / (delta1_val * delta2_val)
```
Contributor:
There's also a particular form of the 2nd-order FD for the diagonal entries of the Hessian, which should however be equivalent to this here... Probably the particular form is just a simplification that comes out after plugging the values into this general form. Still, it might make sense to double check. At least, I would say this here is correct...

@yannikschaelte (author):

note the diagonal is computed a few lines above

Contributor:

ah, indeed... Sorry, seems like I missed that one...

@paulstapor (Contributor) commented:

> I currently don't see an example, might be nice to have something like that? :)
>
> One could make one, but I think that would be rather artificial at the moment, and it would be refactored anyway once we decide how to structure the pyPESTO tutorial; therefore I would rather wait until then with an example?

Would also agree to add an example. However, also fine with adding one in the tutorial. In this case, please open an issue so this won't be forgotten! :)

@yannikschaelte (Member, Author) commented:
> I currently don't see an example, might be nice to have something like that? :)
>
> One could make one, but I think that would be rather artificial at the moment, and it would be refactored anyway once we decide how to structure the pyPESTO tutorial; therefore I would rather wait until then with an example?
>
> Would also agree to add an example. However, also fine with adding one in the tutorial. In this case, please open an issue so this won't be forgotten! :)

Added an example in the docstring now at least ...

@jvanhoefer (Member) left a comment

Looks fine, only minor comments :) (I am a bit curious about the "inconsistency" in the Hessian between mixed and non-mixed 2nd derivatives...)

Resolved (outdated) review thread on pypesto/objective/base.py
Comment on lines +256 to +257:

```python
    or 1 in sensi_orders and not self.has_sres
)
```
Member:

Suggested change:

```diff
     or 1 in sensi_orders and not self.has_sres
+    or 2 in sensi_orders
 )
```

Member:

What if Hessian information is asked for in MODE_RES? (Even though current residual-based optimizers might not need that info)

@yannikschaelte (author):

not aware of any optimizer that does, as most stick to the linear structure ... but it could of course be added then.


```python
if isinstance(delta_fun, str) or isinstance(delta_res, str):
    raise NotImplementedError(
        "Adaptive FD step sizes are not implemented yet.",
    )
```
Member:

I don't know why it has recently become sexy to put a comma at the end of everything (my taste remains old-school there... :D), but if we do that, then I would do it consistently (e.g. two lines below as well...)

Member:

I mean in `f"Method must be one of {methods}"`

@yannikschaelte (author):

haha, it is recommended because it allows modifying fewer lines when e.g. adding additional arguments. True, adding one below.

Comment on lines 31 to 35:

```python
    Derivative method for the gradient.
hess:
    Derivative method for the Hessian.
sres:
    Derivative method for the residual sensitivities.
```
Member:

tbh I do not completely get from that what e.g. grad=True or grad=None would mean. Where would I get the gradient from in these cases? The objective or FD? And also, why is the derivative method a boolean value? :D

@yannikschaelte (author):

{True, False, None} is a ternary variable here: None means whatever is available, True means always FD, False means never. Documented that a few lines above, will add hints.
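Spelled out as a hedged sketch of these semantics (the `FD` name and `grad` keyword as discussed in this PR; the exact API is an assumption):

```python
import numpy as np
import pypesto
from pypesto.objective import FD  # assumed export

objective = pypesto.Objective(fun=lambda x: np.sum(x ** 2))

fd_auto = FD(objective, grad=None)    # use the inner gradient if available, otherwise FD
fd_always = FD(objective, grad=True)  # always approximate the gradient via FD
fd_never = FD(objective, grad=False)  # never FD; only the inner gradient (if any) is used
```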

Comment on lines +148 to +149:

```python
# This is the main method to overwrite from the base class, it handles
# and delegates the actual objective evaluation.
```
Member:

-> docstring?

@yannikschaelte (author):

would rather not use a docstring here, as the docstring should be the one from the base class; this here is just a hint to the programmer.

Comment on lines +339 to +346:

```python
elif self.method == FD.FORWARD:
    f2p = f_fval(x + 2 * delta1)
    fc = f_fval(x + delta1)
    f2m = fval
elif self.method == FD.BACKWARD:
    f2p = fval
    fc = f_fval(x - delta1)
    f2m = f_fval(x - 2 * delta1)
```
Member:

I am not so sure about that; doesn't that essentially approximate the 2nd derivative at x + delta? But I guess that is what the forward/backward mode does for the 1st derivative as well, so it should be fine...

@yannikschaelte (author):

it is (f'(x) - f'(x-d)) / d \approx ( (f(x) - f(x-d)) / d - (f(x-d) - f(x-2d)) / d) / d, so yes should be correct.
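Expanding those nested differences (for the backward case, matching f2p = f(x), fc = f(x - δ), f2m = f(x - 2δ) above) collapses to the standard second-order backward stencil:

```latex
f''(x) \approx \frac{\frac{f(x) - f(x-\delta)}{\delta} - \frac{f(x-\delta) - f(x-2\delta)}{\delta}}{\delta}
       = \frac{f(x) - 2 f(x-\delta) + f(x-2\delta)}{\delta^2}
```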


Comment on lines +391 to +405:

```python
if self.method == FD.CENTRAL:
    fpp = f_fval(x + delta1 / 2 + delta2 / 2)
    fpm = f_fval(x + delta1 / 2 - delta2 / 2)
    fmp = f_fval(x - delta1 / 2 + delta2 / 2)
    fmm = f_fval(x - delta1 / 2 - delta2 / 2)
elif self.method == FD.FORWARD:
    fpp = f_fval(x + delta1 + delta2)
    fpm = f_fval(x + delta1 + 0)
    fmp = f_fval(x + 0 + delta2)
    fmm = fval
elif self.method == FD.BACKWARD:
    fpp = fval
    fpm = f_fval(x + 0 - delta2)
    fmp = f_fval(x - delta1 + 0)
    fmm = f_fval(x - delta1 - delta2)
```
Member:

I mean, there is the possibility to take f_fval(x + delta1 + delta2) here and elsewhere (so drop the /2 everywhere) and instead go for (fpp - fpm - fmp + fmm) / (4 * delta1_val * delta2_val). This would be consistent with the diagonal elements...?

Member:

Though now I am confused why there is no 4* in the diagonals...?

@yannikschaelte (author):

the step size should always be delta, i.e. from x - delta/2 to x + delta/2 (central), or from x to x + delta (forward). I think that should be consistent ...

@yannikschaelte (author):

I think the MATLAB implementation used a step size of delta for the off-diagonals and delta/2 on the diagonals, which may not be the most consistent ... even though one would save a few evaluations when simultaneously computing gradients.
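To spell out the scaling discussed above: with half steps (as in this PR) the evaluation points are δ apart, so no extra factor appears, whereas the textbook full-step stencil carries a 4 in the denominator. Both estimate the same mixed derivative:

```latex
\partial_1 \partial_2 f(x) \approx
  \frac{f\!\left(x + \tfrac{\delta_1}{2} + \tfrac{\delta_2}{2}\right)
      - f\!\left(x + \tfrac{\delta_1}{2} - \tfrac{\delta_2}{2}\right)
      - f\!\left(x - \tfrac{\delta_1}{2} + \tfrac{\delta_2}{2}\right)
      + f\!\left(x - \tfrac{\delta_1}{2} - \tfrac{\delta_2}{2}\right)}{\delta_1 \delta_2}

\partial_1 \partial_2 f(x) \approx
  \frac{f(x + \delta_1 + \delta_2) - f(x + \delta_1 - \delta_2)
      - f(x - \delta_1 + \delta_2) + f(x - \delta_1 - \delta_2)}{4\, \delta_1 \delta_2}
```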

Further resolved (outdated) review threads on pypesto/objective/finite_difference.py.
```python
Boolean indicating whether mode is supported
"""
raise NotImplementedError()
if (
```
Collaborator:

I was wondering about that in another PR, but shouldn't we implement something that checks that the sensi_orders are even valid? Right now, calling check_sensi_orders(sensi_orders=(3,)) would return True. How about adding an `if max(sensi_orders) > 2: raise ValueError`?

@yannikschaelte (author):

would make sense
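A minimal sketch of that suggestion, with hypothetical placement at the top of check_sensi_orders (signature assumed from the call above):

```python
def check_sensi_orders(self, sensi_orders, mode) -> bool:
    """Check whether the requested sensitivity orders are supported."""
    # Hypothetical guard, as suggested above: orders beyond 2 (the Hessian)
    # can never be valid, so raise instead of silently returning True.
    if sensi_orders and max(sensi_orders) > 2:
        raise ValueError(f"Invalid sensitivity orders: {sensi_orders}")
    # ... existing capability checks follow ...
```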

@yannikschaelte yannikschaelte merged commit 0e61e0f into develop May 17, 2021
@yannikschaelte yannikschaelte deleted the fixes_ys branch May 17, 2021 13:40
@yannikschaelte yannikschaelte mentioned this pull request May 17, 2021