Hessian-vector products #356

Closed
jeff-regier opened this issue Jan 31, 2017 · 11 comments · Fixed by #416

Comments

@jeff-regier
Contributor

Have you all considered adding optimization methods that make use of Hessian-vector products, but that don't explicitly form Hessians? I've been thinking about writing a version of newton_trust_region that does that, essentially using conjugate gradient iterations to apply the inverse Hessian to the gradient. Is that something you'd be interested in including in Optim.jl?
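For concreteness, here is a minimal sketch of the kernel being proposed: a truncated conjugate-gradient loop that approximately solves H*s = -g using only Hessian-vector products, so H is never formed. The `hv` closure and the truncation rules are illustrative, not Optim.jl API.

```julia
using LinearAlgebra

# Truncated CG for the Newton system H*s = -g, using only products H*v.
# `hv` is a hypothetical user-supplied closure v -> H*v; a trust-region
# variant would additionally stop at the region boundary.
function newton_cg_step(g::Vector{Float64}, hv; tol=1e-6, maxiter=length(g))
    s = zeros(length(g))
    r = copy(g)                  # residual of H*s + g at s = 0
    p = -r
    for _ in 1:maxiter
        Hp = hv(p)
        curv = dot(p, Hp)
        curv <= 0 && break       # nonpositive curvature: truncate
        α = dot(r, r) / curv
        s .+= α .* p
        rnew = r .+ α .* Hp
        norm(rnew) < tol * norm(g) && return s
        β = dot(rnew, rnew) / dot(r, r)
        p = -rnew .+ β .* p
        r = rnew
    end
    return s
end
```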

jeff-regier/Celeste.jl#380

@mlubin
Contributor

mlubin commented Jan 31, 2017

@dpo and @abelsiqueira have been working on this, not sure if the code is public.

@pkofod
Member

pkofod commented Jan 31, 2017

Well, my initial reaction would be: sure. I'm still very happy about the "old" trust region (I've been using it in my personal projects), and you're doing a good job keeping it up to date. That last part is relevant, because even if we have more active people on here than earlier, new solvers are kind of a... tricky issue. On one hand we might be tempted to say "let's implement everything"; on the other hand we need someone to maintain all that code. So if you're willing to prototype, test, and maintain it, I'm all for it.

@jeff-regier
Contributor Author

Sounds good, I think I'm up for it. What kind of interface do you suggest? Would we add a field named something like hv to TwiceDifferentiableFunction, so that a user may specify a function that returns the product of the Hessian and a vector?

If the user doesn't specify an hv field, it'd be nice if it were automatically populated by ForwardDiff.jl: computing the gradient with dual numbers whose perturbations are set to v (the user-specified vector) would, I think, be a pretty efficient way to compute the product. Is that something you've implemented, perhaps in code that isn't public yet, or that you have experience with?
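The dual-number trick described above fits in a one-liner; this is only a sketch of the idea, and `hessian_vector` is a hypothetical name, not a proposed API.

```julia
using ForwardDiff, LinearAlgebra

# H(x)*v is the directional derivative of the gradient along v, so a single
# forward-mode pass over the gradient yields the product without forming H.
hessian_vector(f, x, v) =
    ForwardDiff.derivative(ε -> ForwardDiff.gradient(f, x .+ ε .* v), 0.0)

# Sanity check: f(x) = 0.5 x'Ax has Hessian A, so the result should be A*v.
A = [2.0 1.0; 1.0 3.0]
f(x) = 0.5 * dot(x, A * x)
hessian_vector(f, [1.0, 2.0], [1.0, 0.0])   # ≈ A * [1.0, 0.0] = [2.0, 1.0]
```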

Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@abelsiqueira

Hello, thanks for the mention, Miles.

We have implemented a matrix-free trust-region Newton method for unconstrained minimization, i.e., using Hessian-vector products. We haven't made a release yet, but it is usable: https://github.com/JuliaSmoothOptimizers/Optimize.jl/blob/master/src/solver/trunk.jl.
The package uses NLPModels.jl, LinearOperators.jl and Krylov.jl to implement this. If you'd rather implement this here, you should consider at least LinearOperators and Krylov.
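To illustrate the pattern those two packages enable (strictly a sketch: the LinearOperator constructor signature has changed across releases, and the toy `hv` and `g` below are stand-ins):

```julia
using LinearOperators, Krylov

# Wrap a Hessian-vector product as a symmetric operator and hand it to a
# Krylov solver, so the Hessian matrix is never materialized.
A = [4.0 1.0; 1.0 3.0]     # stand-in Hessian; never formed in a real solver
hv(v) = A * v              # hypothetical closure v -> H*v
g = [1.0, 2.0]             # stand-in gradient
n = length(g)
H = LinearOperator(Float64, n, n, true, true, (res, v) -> res .= hv(v))
s, stats = cg(H, -g)       # approximate Newton step: solves H*s = -g matrix-free
```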

Both Dominique and I are a little swamped at the moment, but implementing competitive matrix-free methods for large-scale problems is one of our goals.

@anriseth
Contributor

> Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one?

I was hoping to take a look at ReverseDiff AD at some point.

@pkofod
Member

pkofod commented Mar 14, 2017

> Also, any plans to use @jrevels's ReverseDiff.jl library to automatically populate the gradient function, if a user doesn't specify one? That, in combination with ForwardDiff.jl for the Hessian-vector product, would be really useful for us at Celeste.jl, and probably for any number of other projects too.

@jeff-regier we've got basic ReverseDiff support now, if you dare try master. Do note that there are quite a few breaking changes, so...
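For anyone who wants to try the same thing by hand before the Optim.jl wiring settles, a minimal ReverseDiff sketch (the function and inputs are made up for illustration):

```julia
using ReverseDiff

# Reverse mode returns the whole gradient in one backward pass, at a cost
# that is a small constant multiple of evaluating f itself.
f(x) = sum(abs2, x .- 1.0)
x = rand(5)
g = ReverseDiff.gradient(f, x)

# For repeated gradient calls, prerecord and compile the tape once:
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x))
ReverseDiff.gradient!(g, tape, x)
```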

@jeff-regier
Contributor Author

jeff-regier commented Mar 14, 2017

That's good. I've got an implementation of Hessian-free trust-region optimization now over at Celeste.jl, on the jr/cg2 branch:

https://github.com/jeff-regier/Celeste.jl/blob/jr/cg2/src/cg_trust_region.jl

We're still testing it. The code is solid, but I'd still like to build in support for preconditioning.

@pkofod
Member

pkofod commented Mar 14, 2017

Are you aware of our preconditioning code in Optim?

@jeff-regier
Contributor Author

I wasn't, but now I see precon.jl. Thanks for pointing it out. I'll try to stick with that preconditioner interface so it's easy to merge.

@pkofod
Member

pkofod commented Mar 15, 2017

Great, it's the work of @cortner, who's also been using it in his own research, but of course suggestions and improvements are still welcome.

@cortner
Contributor

cortner commented May 22, 2017

Maybe this is obvious to you: in the TR context, the preconditioner doesn't just give you a PCG method, it also specifies the TR topology (the norm that defines the trust region). Then, when you hit the TR boundary, you start solving a generalised eigenvalue problem instead of the standard eigenvalue problem.
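Spelled out for readers following along (a standard statement of the point, not quoted from the thread): with a preconditioner P defining the norm, the subproblem becomes

```latex
\min_{s}\; g^\top s + \tfrac{1}{2}\, s^\top H s
\quad \text{s.t.} \quad \|s\|_P := \sqrt{s^\top P s} \le \Delta ,
```

and boundary solutions satisfy (H + λP) s = -g with λ ≥ 0 and H + λP positive semidefinite, so the hard case leads to the leftmost generalized eigenvalue of the pencil (H, P) rather than of H alone.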
