Hessian-vector products #356

Closed
jeff-regier opened this Issue Jan 31, 2017 · 11 comments

@jeff-regier
Contributor

jeff-regier commented Jan 31, 2017

Have you all considered adding optimization methods that make use of Hessian-vector products, but that don't explicitly form Hessians? I've been thinking about writing a version of newton_trust_region that does that, essentially using conjugate gradient iterations to multiply the gradient by the inverse Hessian. Is that something you'd be interested in including in Optim.jl?

jeff-regier/Celeste.jl#380
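Roughly, the inner solve I have in mind would look something like this (just a sketch of matrix-free CG, not Optim code; `hv(v)` stands for a user-supplied Hessian-vector product):

```julia
using LinearAlgebra

# Sketch: solve H*p = -g with conjugate gradients, touching H only through
# the Hessian-vector product hv(v) = H*v; the Hessian is never formed.
function cg_newton_step(hv, g; tol=1e-8, maxiter=length(g))
    p = zero(g)
    r = -g                  # residual of H*p + g = 0 at p = 0
    d = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Hd = hv(d)
        α = rs / dot(d, Hd)
        p .+= α .* d
        r .-= α .* Hd
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        d .= r .+ (rs_new / rs) .* d
        rs = rs_new
    end
    return p
end

# Usage, with an explicit Hessian only for illustration:
H = [4.0 1.0; 1.0 3.0]; g = [1.0, -2.0]
p = cg_newton_step(v -> H * v, g)    # ≈ -(H \ g)
```

A trust-region version would additionally truncate the CG iteration at the trust-region boundary (Steihaug-style), but the matrix-free structure is the same.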

@mlubin

Contributor

mlubin commented Jan 31, 2017

@dpo and @abelsiqueira have been working on this, not sure if the code is public.

@pkofod

Collaborator

pkofod commented Jan 31, 2017

Well, my initial reaction would be: sure. I'm still very happy about the "old" trust region (I've been using it in my personal projects), and you're doing a good job keeping it up to date. That last part is relevant, because even if we have more active people on here than earlier, adding new solvers is kind of a... tricky issue. On one hand we might be tempted to say "let's implement everything"; on the other hand we need someone to maintain all that code. So if you're willing to make a prototype, test it, and maintain it, I'm all for it.

@jeff-regier

Contributor

jeff-regier commented Jan 31, 2017

Sounds good, I think I'm up for it. What kind of interface do you suggest? Would we add a field named something like hv to TwiceDifferentiableFunction, so that a user may specify a function that returns the product of the Hessian and a vector?

If the user doesn't specify an hv field, it'd be nice if it were automatically populated by ForwardDiff.jl: calculating the gradient with dual numbers whose perturbations are set to v (the user-specified vector) would, I think, be a pretty efficient way to compute the product. Is that something you've implemented, perhaps in code that isn't public yet, or that you have experience with?
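Something like this is what I have in mind (just a sketch using ForwardDiff's public API; nothing here exists in Optim yet):

```julia
using ForwardDiff

# Sketch: Hessian-vector product ∇²f(x)*v computed as the directional
# derivative of the gradient along v, so only one extra dual perturbation
# is needed instead of the full Hessian.
hvp(f, x, v) = ForwardDiff.derivative(t -> ForwardDiff.gradient(f, x .+ t .* v), 0.0)

# Tiny usage example:
f(x) = sum(abs2, x) + x[1] * x[2]
x0 = [1.0, 2.0]; v = [0.5, -1.0]
hvp(f, x0, v)    # matches ForwardDiff.hessian(f, x0) * v
```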

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@abelsiqueira

abelsiqueira commented Jan 31, 2017

Hello, thanks for the mention, Miles.

We have implemented a matrix-free trust-region Newton method for unconstrained minimization, i.e., one that uses Hessian-vector products. We haven't made a release yet, but it is usable: https://github.com/JuliaSmoothOptimizers/Optimize.jl/blob/master/src/solver/trunk.jl.
The package uses NLPModels.jl, LinearOperators.jl and Krylov.jl to implement this. If you'd rather implement this here, you should consider at least LinearOperators and Krylov.

Both Dominique and I are a little swamped at the moment, but implementing competitive matrix-free methods for large-scale problems is one of our goals.
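For a flavour of the Krylov part, the inner solve is roughly this (a toy sketch; a dense symmetric matrix stands in for a LinearOperator built from a Hessian-vector product, just to keep the example self-contained):

```julia
using Krylov

# Toy sketch: solve H*p = -g with Krylov.jl's CG. In the matrix-free setting
# H would be an operator wrapping a Hessian-vector product rather than a
# dense matrix.
H = [4.0 1.0; 1.0 3.0]
g = [1.0, -2.0]
p, stats = Krylov.cg(H, -g)   # p approximates the Newton step -H \ g
```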

@anriseth

Contributor

anriseth commented Jan 31, 2017

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one?

I was hoping to take a look at ReverseDiff AD at some point.

@pkofod

Collaborator

pkofod commented Mar 14, 2017

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@jeff-regier we've got basic ReverseDiff support now if you dare try master. Do note that there are quite a few breaking changes, so...
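For reference, calling ReverseDiff.jl directly looks roughly like this (independent of whatever wrapper master ends up exposing):

```julia
using ReverseDiff

# Sketch: gradient computed with ReverseDiff.jl directly; a wrapper would do
# something equivalent when no gradient function is supplied by the user.
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
x0 = [-1.2, 1.0]
g = ReverseDiff.gradient(rosenbrock, x0)
```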

@jeff-regier

Contributor

jeff-regier commented Mar 14, 2017

That's good. I've got an implementation of Hessian-free trust-region optimization now over at Celeste.jl, on the jr/cg2 branch:

https://github.com/jeff-regier/Celeste.jl/blob/jr/cg2/src/cg_trust_region.jl

We're still testing it. The code is solid, but I'd still like to build in support for preconditioning.

@pkofod

Collaborator

pkofod commented Mar 14, 2017

Are you aware of our preconditioning code in Optim?

@jeff-regier

Contributor

jeff-regier commented Mar 15, 2017

I wasn't, but now I see precon.jl. Thanks for pointing it out. I'll try to stick with that preconditioner interface so it's easy to merge.

@pkofod

Collaborator

pkofod commented Mar 15, 2017

Great, it was written by @cortner, who has also been using it in his own work, but of course suggestions and improvements are still welcome.

@cortner

Contributor

cortner commented May 22, 2017

Maybe this is obvious to you: in the TR context, the preconditioner doesn't just give you a PCG method, it also specifies the TR topology. Then, when you hit the TR boundary, you start solving a generalised eigenvalue problem instead of the standard eigenvalue problem.
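Concretely, with preconditioner P the subproblem and its boundary condition look roughly like this (standard trust-region algebra written out as a sketch, not code from precon.jl):

```latex
% Preconditioned trust-region subproblem: the P-norm defines the TR "topology".
\min_{p}\; g^\top p + \tfrac{1}{2}\, p^\top H p
\quad \text{s.t.} \quad \|p\|_P^2 = p^\top P p \le \Delta^2 .
% Stationarity on the boundary:
(H + \lambda P)\, p = -g, \qquad \lambda \ge 0 .
```

So the hard case is tied to the generalised eigenproblem H v = μ P v rather than the standard H v = μ v, which is the generalised eigenvalue problem mentioned above.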

@pkofod closed this in #416 on Oct 10, 2017
