Improve the OptimizationManopt.jl interface #1009
Conversation
```julia
# TODO: WHY? they both still accept not passing it
function SciMLBase.requireshessian(opt::Union{
        AdaptiveRegularizationCubicOptimizer, TrustRegionsOptimizer})
    true
end
```
How is this function defined and what is it for?
The current definition here is not correct: both ARC and TR can compute their own (actually quite good) approximation of the Hessian – similar to what quasi-Newton does.
So they do not need a Hessian; the exact one of course performs a bit better than the approximate one.
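For illustration, a minimal sketch of that behavior, assuming current Manopt.jl defaults (the manifold, cost, and gradient below are made up for this example): both solvers can be called without an explicit Hessian and fall back to an internal approximation.

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, q)^2
grad_f(M, p) = -2 * log(M, p, q)  # Riemannian gradient of the squared distance

p0 = rand(M)
# No Hessian passed – the solver falls back to an approximate Hessian.
trust_regions(M, f, grad_f, p0)
```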
It's a trait for checking whether the Hessian function is required in order to use the solver. For example, if your solver uses prob.f.hess then this should be true, so that you can fail if a second-order AD method is not given.
If it's not required, then this should be false. What this does is, if true, turn on an error message that says "prob.f.hess is not defined and therefore you cannot use this method" (not exactly, but at a high level that's pretty much what it's for: higher-level error messages and reporting).
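A hedged sketch of that gate (illustrative only, not the actual SciMLBase implementation; `check_hessian_available` is a hypothetical helper):

```julia
using SciMLBase

# Hypothetical guard: error early when a solver needs a Hessian but none is given.
function check_hessian_available(opt, f)
    if SciMLBase.requireshessian(opt) && f.hess === nothing
        error("prob.f.hess is not defined and therefore you cannot use this method; " *
              "provide a Hessian or a second-order AD backend.")
    end
    return nothing
end
```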
Currently:
```julia
# TODO: I do not understand this. Why is the manifold not used?
# Either this is an Euclidean cost, then we should probably still call `embed`,
# or it is not, then we need M.
return function (::AbstractManifold, θ)
```
Here we should check what is best to do; the current one works in some cases, but not all.
This I don't know. Do you need to know the manifold to know how to calculate the loss? I guess to know the mapping for some parameter values in some representations?
The signature of cost/grad/hess always has the manifold as the first parameter, since this allows implementing one cost for arbitrary manifolds, e.g. the Karcher mean, which minimises the sum of squared distances.
My main problem is that I do not understand which cost this is:

- on a manifold, it would contradict what the gradient function next does
- in the embedding, we would have to call `embed(M, θ)` before passing it to the function f that is defined in the embedding.

As long as `embed` is the identity, as for SPDs and the sphere, the current code works. But for fixed rank, for example, it would not.
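To illustrate the signature convention (a hedged sketch; `data` and the manifold choice are made up for this example), one generic cost can then be evaluated on any manifold:

```julia
using Manifolds

data = [rand(Sphere(2)) for _ in 1:5]

# One cost for arbitrary manifolds: a Karcher-mean-style objective that
# sums the squared distances from θ to the data points.
cost(M::AbstractManifold, θ) = sum(distance(M, θ, p)^2 for p in data)

M = Sphere(2)
cost(M, rand(M))
```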
@ChrisRackauckas @Vaibhavdixit02 I think I did nearly all I could do here regarding the questions above.
From the Manopt.jl side all has been addressed – even the convergence should now perform much better, since we introduced a `has_converged` (JuliaManifolds/Manopt.jl#511).
Ha! Found the bug! Someone (before me) implemented one Riemannian Hessian in the wrapper using […]. So tests run fine now. I would still prefer to get answers to the questions above.
# Conflicts:
#   lib/OptimizationManopt/Project.toml
As a last step for today I tried generating the docs locally, but it fails on other solvers' examples like NOMAD and MOI – so I cannot check how they would look. Also, there are 5 broken links that error. 🤨
# Conflicts:
#   docs/Project.toml
Rebase onto latest master so you get the test fixes.
```julia
end

# cf. https://github.com/SciML/SciMLBase.jl/blob/master/src/problems/optimization_problems.jl
# {iip} is the parameter here – nowhere explained but very much probably “is in place”
```
It's defined in the SciMLBase interfaces.
It took me a very long time to find that. But it is your package, so I do not have to maintain the code.
Is what you are looking for instead a comment that says `# {true} denotes the dispatch is in-place`? That's fine. See the other comment thread; I think I may have misinterpreted what you meant with this stream of text.
Yes, because that is nowhere documented, so if there is at least a comment here with a source, that might help future people looking at this. As I said, I spent about an hour finding that out, mostly by looking at how it is used.
I cannot write all your missing docs, but I can help code that uses the undocumented things to be more readable. This change here indicated to me that code should not be readable?
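For future readers, a hedged illustration of what `{iip}` encodes (the Rosenbrock functions below are just an example, not code from this PR): the type parameter distinguishes whether user functions mutate their first argument.

```julia
rosenbrock(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2

# Out-of-place (iip = false): the gradient is returned.
grad_oop(u, p) = [
    -2 * (p[1] - u[1]) - 4 * p[2] * u[1] * (u[2] - u[1]^2),
    2 * p[2] * (u[2] - u[1]^2),
]

# In-place (iip = true): the gradient is written into the preallocated G.
function grad_iip!(G, u, p)
    G[1] = -2 * (p[1] - u[1]) - 4 * p[2] * u[1] * (u[2] - u[1]^2)
    G[2] = 2 * p[2] * (u[2] - u[1]^2)
    return nothing
end
```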
Looks reasonable and is passing tests. I left comments for each of the questions in there.
Originally I wanted to check the examples and provide nicer ones, but this was rushed so fast that it took only about a third of my grandfather's 93rd birthday cake and dinner to be merged. Now I feel a bit demotivated to do a next PR, check all of that again, and look at the examples. The current ones are – mildly phrased – technical and too complicated. There are much easier ones for new users to see how Manopt could be used.

edit: For example – the starting point of this PR was to fix that all solver runs still always report FAILURE in https://docs.sciml.ai/Optimization/stable/optimization_packages/manopt/. I had hoped my code did that, but I am still unsure how and where that is decided – because again, it is only mildly documented, if at all, how that return code is determined. Probably it is somewhere in the 150+ packages, but a mere muggle like me has no chance of finding that. But ok. In this rushed development here, maybe the time to find out that things work is also not so wanted. That is my feeling, sadly.
Hey, first of all, calm down and take a deep breath. I don't see what the controversy in all of this is, but we can take it in steps. It is becoming very difficult to weave through the passive-aggressive comments here to find out what exactly is being discussed. For the purpose of changing the tone going forward, I'm just going to give the benefit of the doubt and respond faithfully to every line, and I hope you will give the benefit of the doubt similarly in the future as well.
I'm not entirely sure what was rushed here. If you are truly feeling rushed, I am sorry; that is not the intention. For this PR I received multiple pings and Slack messages about not being attentive to it. I assumed that this meant I needed to move faster because you wanted to get this done quicker. Sorry if I misinterpreted that.
Normally it helps the velocity of the repo and the simplicity of the process to "do one thing per PR". It wasn't clear to me after the last comments that the intention was also to put the example change in this PR. I don't think it would be necessary to do it in this PR, because that example should at least work if the package is working again. While I would agree that "the SPD example is too complicated and bogus" as a first example for solving optimizations on manifolds – the first problem of this type is usually something simple like "optimize on a sphere!" – writing a new tutorial that improves the learning experience is pretty disconnected from just getting the package up and running on the latest versions, and so normal development practice would split those into separate PRs.

It doesn't necessarily have to, but since that's the norm, I had assumed that was the direction this was moving, since the update process turned into a month-long slog instead of something quick. And with the pings from you and @oscardssmith about unblocking latest versions around ForwardDiff, and the autodiff channel discussions on the topic (or wherever that was), I had thought "oh, I am behind, and everyone wants this in; let me make sure I get tests fixed so this can rebase and get in, and then we continue in the next step". The goal was to reduce the maintenance burden by breaking this down into simple problems and taking it step by step, getting the critical bug fixes and version bumps out there ASAP once tests were passing, due to requests from the ecosystem, and then following up with next steps. I am sorry if that was not clearly communicated. So going forward, I would suggest we do this in separate steps.
It would be due to the return code handling. You can see the solution struct being built in https://github.com/SciML/SciMLBase.jl/blob/master/src/solutions/optimization_solutions.jl#L84-L135. The default should be ReturnCode.Default in the constructor, but right here, https://github.com/SciML/Optimization.jl/blob/v4.8.0/lib/OptimizationManopt/src/OptimizationManopt.jl#L460, it's overridden so that it's always Success or Failure. I am not sure why.
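A hedged sketch of the alternative being discussed (simplified; `manopt_retcode` and the `converged` flag are hypothetical, cf. the `has_converged` idea in JuliaManifolds/Manopt.jl#511):

```julia
using SciMLBase: ReturnCode

# Instead of hard-coding Success/Failure, derive the return code from the
# solver's own convergence information.
manopt_retcode(converged::Bool) =
    converged ? ReturnCode.Success : ReturnCode.Failure
```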
All of this is contained within only this repo (notice that OptimizationBase is a part of this GitHub repository) or SciMLBase, where SciMLBase defines the global interfaces for all of SciML. Indeed, everything asked in this PR is documented there:

- the in-place specification: https://docs.sciml.ai/SciMLBase/stable/interfaces/Problems/#In-place-Specification
- the algorithm trait system: https://docs.sciml.ai/SciMLBase/stable/interfaces/Algorithms/
- the return code specification: https://docs.sciml.ai/SciMLBase/stable/interfaces/Solutions/#retcodes

Notably, the optimizer algorithm traits are missing from the specification (and shouldn't be optimizer-specific...), so that is something I will have to clean up, as another contributor did not properly define the interfaces here. This is part of the greater Optimization.jl cleanup that is in progress.

Note that I have actually wanted to pull the repos together, but other contributors mentioned in SciML/DifferentialEquations.jl#1082 that they wanted to see SciMLBase kept separate, so the architecture we're going towards, by popular demand, is to have just SciMLBase as top-level interfaces with each tower then being contained: one repo for all differential equations, one repo for all of optimization, etc. So it should generally only be 2 repos required, where SciMLBase only comes into play when dealing with the general multi-repo pieces (i.e. interface disconnected from solver implementation).
We are happy to have you here. Sorry if you feel rushed. But instead of reverting, I think at this point it would be best to keep moving forward, and one of the major things to do here is to divvy this up into discrete chunks of work in order to invite more people to participate. I will help out by taking the initiative to turn the remaining points I see into documented issues, and will start to look at this after JuliaCon Paris (of course, today is JuliaCon Paris and I just had a 28-hour flight from Australia to get here, so sorry if I'm a bit less responsive). I'll create the issues; let me know if any topic is missing.
Other comments addressed in the additional threads.
Thanks for the explanations. I feel the way of working here is maybe too different from how I usually work. My PRs are slow, never rushed, and aim to provide thorough, well-documented code. Here I feel too rushed when things are merged that fast. This PR was meant to do one thing: fix the `FAILURE` return code.
I sat down for about an hour and looked more seriously at the interface.
The old form had a bit too much clutter in it.
This should close #906, #943, and #944. We could also check #814 on this branch a bit closer.
With `has_converged` (JuliaManifolds/Manopt.jl#511, to be merged soon), we can even resolve that currently Optimization.jl always claims solver runs with Manopt failed.
/cc @Vaibhavdixit02 @oscardssmith (maybe @ChrisRackauckas?)