Hessian-vector products #356

Closed
jeff-regier opened this Issue Jan 31, 2017 · 11 comments

@jeff-regier
Contributor

jeff-regier commented Jan 31, 2017

Have you all considered adding optimization methods that make use of Hessian-vector products, but that don't explicitly form Hessians? I've been thinking about writing a version of newton_trust_region that does that, essentially using conjugate gradient iterations to multiply the gradient by the inverse Hessian. Is that something you'd be interested in including in Optim.jl?

jeff-regier/Celeste.jl#380
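Roughly, the inner solve I have in mind would look something like this (just a sketch of matrix-free CG, not Optim code; `hv(v)` stands for a user-supplied Hessian-vector product):

```julia
using LinearAlgebra

# Sketch: solve H*p = -g with conjugate gradients, touching H only through
# the Hessian-vector product hv(v) = H*v; the Hessian is never formed.
function cg_newton_step(hv, g; tol=1e-8, maxiter=length(g))
    p = zero(g)
    r = -g                  # residual of H*p + g = 0 at p = 0
    d = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Hd = hv(d)
        α = rs / dot(d, Hd)
        p .+= α .* d
        r .-= α .* Hd
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        d .= r .+ (rs_new / rs) .* d
        rs = rs_new
    end
    return p
end

# Usage, with an explicit Hessian only for illustration:
H = [4.0 1.0; 1.0 3.0]; g = [1.0, -2.0]
p = cg_newton_step(v -> H * v, g)    # ≈ -(H \ g)
```

A trust-region version would additionally truncate the CG iteration at the trust-region boundary (Steihaug-style), but the matrix-free structure is the same.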

@mlubin

Contributor

mlubin commented Jan 31, 2017

@dpo and @abelsiqueira have been working on this, not sure if the code is public.

@pkofod

Collaborator

pkofod commented Jan 31, 2017

Well, my initial reaction would be: sure. I'm still very happy about the "old" trust region (I've been using it in my personal projects), and you're doing a good job keeping it up to date. That last part is relevant, because even if we have more active people on here than earlier, adding new solvers is kind of a... tricky issue. On one hand we might be tempted to say "let's implement everything"; on the other hand we need someone to maintain all that code. So if you're willing to make a prototype, test it, and maintain it, I'm all for it.

@jeff-regier

Contributor

jeff-regier commented Jan 31, 2017

Sounds good, I think I'm up for it. What kind of interface do you suggest? Would we add a field named something like hv to TwiceDifferentiableFunction, so that a user may specify a function that returns the product of the Hessian and a vector?

If the user doesn't specify an hv field, it'd be nice if it were automatically populated by ForwardDiff.jl: calculating the gradient with dual numbers whose perturbations are set to v (the user-specified vector) would, I think, be a pretty efficient way to compute the product. Is that something you've implemented, perhaps in code that isn't public yet, or that you have experience with?
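Something like this is what I have in mind (just a sketch using ForwardDiff's public API; nothing here exists in Optim yet):

```julia
using ForwardDiff

# Sketch: Hessian-vector product ∇²f(x)*v computed as the directional
# derivative of the gradient along v, so only one extra dual perturbation
# is needed instead of the full Hessian.
hvp(f, x, v) = ForwardDiff.derivative(t -> ForwardDiff.gradient(f, x .+ t .* v), 0.0)

# Tiny usage example:
f(x) = sum(abs2, x) + x[1] * x[2]
x0 = [1.0, 2.0]; v = [0.5, -1.0]
hvp(f, x0, v)    # matches ForwardDiff.hessian(f, x0) * v
```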

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@abelsiqueira

abelsiqueira commented Jan 31, 2017

Hello, thanks for the mention, Miles.

We have implemented a matrix-free trust-region Newton method for unconstrained minimization, i.e., one that uses Hessian-vector products. We haven't made a release yet, but it is usable: https://github.com/JuliaSmoothOptimizers/Optimize.jl/blob/master/src/solver/trunk.jl.
The package uses NLPModels.jl, LinearOperators.jl and Krylov.jl to implement this. If you'd rather implement this here, you should consider at least LinearOperators and Krylov.

Both Dominique and I are a little swamped at the moment, but implementing competitive matrix-free methods for large-scale problems is one of our goals.
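For a flavour of the Krylov part, the inner solve is roughly this (a toy sketch; a dense symmetric matrix stands in for a LinearOperator built from a Hessian-vector product, just to keep the example self-contained):

```julia
using Krylov

# Toy sketch: solve H*p = -g with Krylov.jl's CG. In the matrix-free setting
# H would be an operator wrapping a Hessian-vector product rather than a
# dense matrix.
H = [4.0 1.0; 1.0 3.0]
g = [1.0, -2.0]
p, stats = Krylov.cg(H, -g)   # p approximates the Newton step -H \ g
```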

@anriseth

Contributor

anriseth commented Jan 31, 2017

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one?

I was hoping to take a look at ReverseDiff AD at some point.

@pkofod

Collaborator

pkofod commented Mar 14, 2017

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@jeff-regier we've got basic ReverseDiff support now if you dare try master. Do note that there are quite a few breaking changes, so...
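For reference, calling ReverseDiff.jl directly looks roughly like this (independent of whatever wrapper master ends up exposing):

```julia
using ReverseDiff

# Sketch: gradient computed with ReverseDiff.jl directly; a wrapper would do
# something equivalent when no gradient function is supplied by the user.
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
x0 = [-1.2, 1.0]
g = ReverseDiff.gradient(rosenbrock, x0)
```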

@jeff-regier

Contributor

jeff-regier commented Mar 14, 2017

That's good. I've got an implementation of Hessian-free trust-region optimization now over at Celeste.jl, on the jr/cg2 branch:

https://github.com/jeff-regier/Celeste.jl/blob/jr/cg2/src/cg_trust_region.jl

We're still testing it. The code is solid, but I'd still like to build in support for preconditioning.

@pkofod

Collaborator

pkofod commented Mar 14, 2017

Are you aware of our preconditioning code in Optim?

@jeff-regier

Contributor

jeff-regier commented Mar 15, 2017

I wasn't, but now I see precon.jl. Thanks for pointing it out. I'll try to stick with that preconditioner interface so it's easy to merge.

@pkofod

Collaborator

pkofod commented Mar 15, 2017

Great, it was written by @cortner, who has also been using it in his own work, but of course suggestions and improvements are still welcome.

@cortner

Contributor

cortner commented May 22, 2017

Maybe this is obvious to you: in the TR context, the preconditioner doesn't just give you a PCG method, it also specifies the TR topology. Then, when you hit the TR boundary, you start solving a generalised eigenvalue problem instead of the standard eigenvalue problem.
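Concretely, with preconditioner P the subproblem and its boundary condition look roughly like this (standard trust-region algebra written out as a sketch, not code from precon.jl):

```latex
% Preconditioned trust-region subproblem: the P-norm defines the TR "topology".
\min_{p}\; g^\top p + \tfrac{1}{2}\, p^\top H p
\quad \text{s.t.} \quad \|p\|_P^2 = p^\top P p \le \Delta^2 .
% Stationarity on the boundary:
(H + \lambda P)\, p = -g, \qquad \lambda \ge 0 .
```

So the hard case is tied to the generalised eigenproblem H v = μ P v rather than the standard H v = μ v, which is the generalised eigenvalue problem mentioned above.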

@pkofod closed this in #416 on Oct 10, 2017
