diff: Add finite difference approximation of the Hessian #23
The considerations below apply to approximating the Hessian directly from function values. There should definitely also be another function that approximates the Hessian from the analytical gradient.

I've been looking at this and it's not as straightforward as I thought. I'm not sure if we really want to allow generic formulas for computing the Hessian. With generic formulas the implementation would have to be quite complicated, especially if we tried to avoid repeated evaluations of f at the same location on the diagonal. Consider the forward formula in 2D, that is, a forward difference of two forward differences. For the off-diagonal element it leads to evaluation at the points

    x, x + h₁e₁, x + h₂e₂, x + h₁e₁ + h₂e₂,

while for the diagonal elements it leads to

    x, x + hᵢeᵢ, x + 2hᵢeᵢ,

with the coefficient −2 at x + hᵢeᵢ, because the middle point enters the composed difference twice.

Alternatively, we could specify two formulas, but that brings its own complications (e.g., which dimension gets which formula and the related symmetry of the Hessian, two origins to identify, API, ...). (Even in the one-formula case there won't be exact symmetry because of floating point. We will have to ignore it.) Also, the number of function evaluations would become very hard to predict for the caller, see #29.

Based on these considerations, I propose to simply fix the formula to Forward: clear, efficient, simple to implement, takes advantage of OriginKnown, and the number of evaluations is more or less predictable. That's one point. The other point is, in case you approve this fixed-formula approach, @btracey, how to design the signature. Formula is passed in Settings. The options are:
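The fixed Forward formula described above can be sketched as follows. This is only an illustration of the evaluation pattern, assuming the standard forward second-difference stencils; `hessForward` and its signature are hypothetical, not the actual package API:

```go
package main

import "fmt"

// hessForward approximates the Hessian of f at x using only forward
// differences with a single step size h (a simplification of the
// per-dimension steps discussed above).
func hessForward(f func([]float64) float64, x []float64, h float64) [][]float64 {
	n := len(x)
	hess := make([][]float64, n)
	for i := range hess {
		hess[i] = make([]float64, n)
	}

	fx := f(x) // the origin, evaluated exactly once (cf. OriginKnown)

	// f(x + h e_i) for every dimension, reused by diagonal and
	// off-diagonal elements alike, so no location is evaluated twice.
	fxi := make([]float64, n)
	for i := 0; i < n; i++ {
		x[i] += h
		fxi[i] = f(x)
		x[i] -= h
	}

	for i := 0; i < n; i++ {
		// Diagonal: (f(x+2h e_i) - 2 f(x+h e_i) + f(x)) / h².
		x[i] += 2 * h
		f2 := f(x)
		x[i] -= 2 * h
		hess[i][i] = (f2 - 2*fxi[i] + fx) / (h * h)

		// Off-diagonal: forward difference of forward differences.
		for j := i + 1; j < n; j++ {
			x[i] += h
			x[j] += h
			fij := f(x)
			x[i] -= h
			x[j] -= h
			v := (fij - fxi[i] - fxi[j] + fx) / (h * h)
			hess[i][j] = v
			hess[j][i] = v // symmetric by construction
		}
	}
	return hess
}

func main() {
	// f(x, y) = x²y + y³, whose exact Hessian is [[2y, 2x], [2x, 6y]].
	f := func(x []float64) float64 { return x[0]*x[0]*x[1] + x[1]*x[1]*x[1] }
	h := hessForward(f, []float64{1, 2}, 1e-4)
	fmt.Println(h[0][0], h[0][1], h[1][1]) // approximately 4, 2, 12
}
```

In this sketch the origin and the n single-step points are each evaluated once, so the total cost is 1 + 2n + n(n−1)/2 evaluations, which is the kind of predictability argued for above.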
Have you had time to give this a thought, @btracey? There is another, fourth option: I have implemented the fixed-formula case, and unlike the general two-formula case the implementation is pleasingly simple. On the other hand, in the fixed-formula case the off-diagonal elements are of limited accuracy.
I think it's reasonable to remove the formula from Settings. This matches how optimize does it. I agree that we should have "Hessian from analytic derivative", though the actual code for that will be almost identical to Jacobian, except that the Hessian is symmetric while the Jacobian doesn't have to be.
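The gradient-based variant might be sketched like this: forward-difference the analytic gradient column by column, exactly as one would approximate a Jacobian, and then symmetrize. `hessFromGrad` is a hypothetical name, not the actual package API:

```go
package main

import "fmt"

// hessFromGrad approximates the Hessian by forward-differencing an
// analytic gradient, one coordinate perturbation per column, then
// averages H with its transpose to enforce the symmetry that a plain
// Jacobian approximation would lack.
func hessFromGrad(grad func([]float64) []float64, x []float64, h float64) [][]float64 {
	n := len(x)
	hess := make([][]float64, n)
	for i := range hess {
		hess[i] = make([]float64, n)
	}

	g0 := grad(x) // gradient at the origin
	for j := 0; j < n; j++ {
		x[j] += h
		gj := grad(x)
		x[j] -= h
		for i := 0; i < n; i++ {
			// Column j of the Jacobian of grad: d(g_i)/d(x_j).
			hess[i][j] = (gj[i] - g0[i]) / h
		}
	}

	// Symmetrize: H <- (H + Hᵀ)/2.
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			v := 0.5 * (hess[i][j] + hess[j][i])
			hess[i][j], hess[j][i] = v, v
		}
	}
	return hess
}

func main() {
	// The gradient of f(x, y) = x²y + y³ is (2xy, x² + 3y²);
	// the exact Hessian at (1, 2) is [[4, 2], [2, 12]].
	grad := func(x []float64) []float64 {
		return []float64{2 * x[0] * x[1], x[0]*x[0] + 3*x[1]*x[1]}
	}
	h := hessFromGrad(grad, []float64{1, 2}, 1e-6)
	fmt.Println(h[0][0], h[0][1], h[1][1]) // approximately 4, 2, 12
}
```

The final symmetrization loop is the only place this differs from a Jacobian approximation, which matches the point above that the two pieces of code would be almost identical.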