trust-region optimization #161
Comments
Indeed, trust-region methods would be nice to have, but I didn't follow the 'compared with 2nd order counterparts' comment: in my view, a trust region is mainly an alternative to line search. For instance, it would be great to have trustncg (Newton-CG + trust region) as an alternative to ncg (Newton-CG + line search).
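To make the pairing concrete, here is a minimal sketch using SciPy rather than this library (an assumption for illustration only; nothing here reflects this project's API). SciPy's `minimize` offers both `Newton-CG` (line search) and `trust-ncg` (trust region), and the Rosenbrock function and starting point are just a convenient test problem.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

# Illustrative starting point for the 2-D Rosenbrock function (minimum at [1, 1]).
x0 = np.array([-1.2, 1.0])

# Newton-CG direction combined with a line search.
res_ls = minimize(rosen, x0, method="Newton-CG", jac=rosen_der, hess=rosen_hess)

# Newton-CG (Steihaug-style) steps restricted to a trust region.
res_tr = minimize(rosen, x0, method="trust-ncg", jac=rosen_der, hess=rosen_hess)

for name, res in [("Newton-CG", res_ls), ("trust-ncg", res_tr)]:
    print(f"{name}: x = {res.x}, iterations = {res.nit}")
```

Both calls take the same objective, gradient, and Hessian, so the only thing that changes is how the step is globalized.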
That's great to hear! Re my comparison: I'm far from an expert, and my limited understanding is that Newton-CG + line search first picks a direction (the gradient modified by the Hessian or an approximation of it) and then determines the step length using line search (or some fixed rate), whereas a trust-region method first limits the step length to the currently accepted trust radius and then finds an appropriate direction, which may or may not coincide with that of ncg (depending on whether the optimal step lies inside the ball). If the problem is convex, both should ultimately find the same optimum, but for problems like variance components, which (I believe) are generally non-convex, trust-region approaches seem to be a bit better behaved.
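The "inside the ball or not" distinction can be sketched in a few lines of NumPy. This is a simplified illustration, not what any particular solver does: the function name `trust_region_step` is hypothetical, and when the Newton step is unavailable or too long it falls back to the Cauchy point (steepest descent clipped to the ball) instead of solving the subproblem more accurately, as trust-ncg or dogleg methods would.

```python
import numpy as np

def trust_region_step(g, H, radius):
    """Approximate minimizer of g@p + 0.5*p@H@p subject to ||p|| <= radius.

    Hypothetical helper for illustration only.
    """
    g = np.asarray(g, dtype=float)
    H = np.asarray(H, dtype=float)
    try:
        np.linalg.cholesky(H)               # raises if H is not positive definite
        p_newton = -np.linalg.solve(H, g)   # unconstrained (Newton) step
        if np.linalg.norm(p_newton) <= radius:
            return p_newton                 # optimal step lies inside the ball
    except np.linalg.LinAlgError:
        pass                                # non-convex model: no Newton step

    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)
    # Cauchy point: minimize the model along -g within the ball.
    curvature = g @ H @ g
    tau = 1.0 if curvature <= 0 else min(1.0, g_norm**3 / (radius * curvature))
    return -tau * radius / g_norm * g

g = np.array([2.0, 1.0])
H_convex = np.diag([2.0, 1.0])
H_saddle = np.diag([1.0, -1.0])

p_inside = trust_region_step(g, H_convex, radius=2.0)    # Newton step fits
p_clipped = trust_region_step(g, H_convex, radius=0.5)   # radius binds
p_saddle = trust_region_step(np.array([1.0, 1.0]), H_saddle, radius=1.0)
```

With an indefinite Hessian, as in the last call, the step is still well defined and bounded by the radius, which is one reason trust-region methods tend to behave well on non-convex problems.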
This thread may be dead, but through a minor object-oriented modification of OSQP you can support arbitrary convex sets for which the support function, as well as the projection onto the recession and normal cones, can be easily computed, which covers most simple convex sets of interest (see Banjac et al. (2019) "Infeasibility Detection in the Alternating Direction Method
Hi all,
Thanks for the awesome library. It would be fantastic to eventually see some options for trust-region optimization. For lower-dimensional problems in my field (e.g., variance components or their generalizations), it provides a much more robust approach for inference compared with 2nd order counterparts.