
diff: Add finite difference approximation of the Hessian #23

Closed
vladimir-ch opened this issue Mar 22, 2016 · 3 comments

Comments

@vladimir-ch
Member

No description provided.

@vladimir-ch
Member Author

The considerations below apply to approximating the Hessian directly from function values. There should definitely also be another function that approximates the Hessian from the analytical gradient.

I've been looking at this and it's not as straightforward as I thought. I'm not sure that we really want to allow generic formulas for computing the Hessian. With generic formulas the implementation would have to be quite complicated, especially if we tried to avoid repeated evaluations of f at the same location for the diagonal. Consider the forward formula in 2D, that is, a forward difference of two forward differences. For the off-diagonal elements it leads to evaluations at the points:

```
[0,h]..[h,h]
  .  ..  .
[0,0]..[h,0]
```

while for the diagonal elements

```
[0,0]..[h,0]..[2h,0]
```

with the coefficient at [h,0] being -2. If we didn't take care, we would evaluate twice at [h,0] instead of evaluating once and multiplying by -2. Even worse, for the diagonal it would actually be better to use the same formula shifted one step to the left. For general formulas we would have to detect such situations in a general way, for little benefit.
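Written out with the origin at (x, y), the two stencils above are

$$
\frac{\partial^2 f}{\partial x\,\partial y} \approx \frac{f(x+h,\,y+h) - f(x+h,\,y) - f(x,\,y+h) + f(x,\,y)}{h^2},
\qquad
\frac{\partial^2 f}{\partial x^2} \approx \frac{f(x+2h,\,y) - 2f(x+h,\,y) + f(x,\,y)}{h^2},
$$

so f(x+h, y) appears in both; that shared evaluation is exactly what a naive generic implementation would compute twice.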

Alternatively, we could specify two formulas, but that brings its own complications (e.g., which dimension gets which formula, the related symmetry of the Hessian, two origins to identify, API, ...). (Even in the one-formula case there won't be exact symmetry because of floating-point error; we will have to ignore that.)

Also, the number of function evaluations would become very hard for the caller to predict; see #29.

Based on these considerations, I propose to simply fix the formula to Forward (sketched below). It is clear, efficient, simple to implement, takes advantage of OriginKnown, and the number of evaluations is more or less predictable. That's one point.
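For concreteness, here is a minimal sketch of what the fixed Forward version could look like (the name, signature, and choice of step are illustrative only, not the proposed API):

```go
package main

import "fmt"

// hessForward sketches the fixed Forward proposal: the Hessian is
// approximated purely from function values, and f(x) and f(x + h·e_i)
// are each computed once and shared between the diagonal and
// off-diagonal stencils.
func hessForward(f func([]float64) float64, x []float64, h float64) [][]float64 {
	n := len(x)
	f0 := f(x) // origin, shared by every element (cf. OriginKnown)

	// f(x + h·e_i), shared by the diagonal and off-diagonal stencils.
	fp := make([]float64, n)
	for i := range fp {
		x[i] += h
		fp[i] = f(x)
		x[i] -= h
	}

	hess := make([][]float64, n)
	for i := range hess {
		hess[i] = make([]float64, n)
	}
	for i := 0; i < n; i++ {
		// Diagonal: (f(x+2h·e_i) - 2f(x+h·e_i) + f(x)) / h².
		x[i] += 2 * h
		f2 := f(x)
		x[i] -= 2 * h
		hess[i][i] = (f2 - 2*fp[i] + f0) / (h * h)

		// Off-diagonal: forward difference of two forward differences.
		for j := i + 1; j < n; j++ {
			x[i] += h
			x[j] += h
			fij := f(x)
			x[i] -= h
			x[j] -= h
			v := (fij - fp[i] - fp[j] + f0) / (h * h)
			hess[i][j] = v
			hess[j][i] = v // symmetric by construction
		}
	}
	return hess
}

func main() {
	// f(x,y) = x²y + y³, whose Hessian is [[2y, 2x], [2x, 6y]].
	f := func(x []float64) float64 { return x[0]*x[0]*x[1] + x[1]*x[1]*x[1] }
	fmt.Println(hessForward(f, []float64{1, 2}, 1e-4))
}
```

With this arrangement the cost is exactly 1 + 2n + n(n-1)/2 evaluations of f, which is the predictability argued for above.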

The other point, in case you approve this fixed-formula approach, @btracey, is how to design the signature. Currently the formula is passed in Settings. The options are:

  1. Require the formula to be nil and panic otherwise.
  2. Ignore the formula.
  3. Remove the formula from Settings and have it as a separate parameter in Derivative and Gradient.

I prefer 3.

@vladimir-ch
Copy link
Member Author

Have you had time to give this some thought, @btracey?

There is a fourth option:
4. Have separate HessianSettings with two formulas. They would default to Backward and Forward, because Central2nd is in fact a backward difference of two forward differences (see the identity below). We would still have to determine what their combination at the diagonal looks like in the general case in order to minimize function evaluations. But how many reasonable pairs of formulas are there for this anyway?
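For the record, the identity behind those defaults: with the forward first difference $D^{+}f(x) = (f(x+h) - f(x))/h$, taking its backward difference gives

$$
\frac{D^{+}f(x) - D^{+}f(x-h)}{h} = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2},
$$

which is exactly the Central2nd stencil.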

I have implemented the fixed-formula case, and unlike the general two-formula case the implementation is pleasingly simple. On the other hand, in the fixed-formula case the off-diagonal elements are of limited accuracy.

@btracey
Member

btracey commented Mar 31, 2016

I think it's reasonable to remove the formula from Settings. This matches how optimize does it.

I agree that we should have "Hessian from analytic derivative", though the actual code for that will be almost identical to Jacobian, except that the Hessian is symmetric while the Jacobian doesn't have to be.
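A hypothetical sketch of that (the names and the dst-style gradient callback are assumptions, not a settled API); the symmetrization at the end is the one place it differs from a plain Jacobian:

```go
// hessFromGrad forward-differences each gradient component and then
// averages the result with its transpose to enforce symmetry.
func hessFromGrad(grad func(dst, x []float64), x []float64, h float64) [][]float64 {
	n := len(x)
	g0 := make([]float64, n)
	grad(g0, x)

	gj := make([]float64, n)
	hess := make([][]float64, n)
	for i := range hess {
		hess[i] = make([]float64, n)
	}
	for j := 0; j < n; j++ {
		x[j] += h
		grad(gj, x)
		x[j] -= h
		for i := 0; i < n; i++ {
			hess[i][j] = (gj[i] - g0[i]) / h
		}
	}
	// The finite-difference Jacobian of the gradient is symmetric only
	// up to truncation and round-off error, so average it out.
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			v := (hess[i][j] + hess[j][i]) / 2
			hess[i][j] = v
			hess[j][i] = v
		}
	}
	return hess
}
```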

@btracey btracey changed the title Add finite difference approximation of the Hessian diff: Add finite difference approximation of the Hessian Mar 16, 2017
@btracey btracey closed this as completed Jul 28, 2017