-
Interesting.
this yields exactly the same numbers as your gradient, but on manifolds other than R2 (your manifold is essentially R2). Then, to see a little more of what is happening (though I saw directly what the reason is), you can do

```julia
xMean = gradient_descent(M, F, gradF, x0; debug=[:Iteration, :Cost, "\n", 10])
```

to print a line with the iteration and the cost (followed by a newline) every 10th iteration. If you then look at the default values of gradient_descent, you can see that it works with a constant step size by default (machine learning people call that a learning rate), which is not guaranteed to converge (ever). But you can exchange that, see https://manoptjl.org/stable/plans/index.html#Stepsize-1, for a step size that might fit better, for example the Armijo step size rule; its defaults are chosen to be just fine:

```julia
julia> xMean = gradient_descent(M, F, gradF, x0; stepsize=ArmijoLinesearch(), debug=[:Iteration, :Cost, "\n", 10])
Initial  F(x): 2.831969279439222
# 10     F(x): 0.2578005738010819
# 20     F(x): 0.2000672051680747
# 30     F(x): 0.20000006190528008
# 40     F(x): 0.20000000005719792
# 50     F(x): 0.2000000000000529
# 60     F(x): 0.20000000000000012
2-element MVector{2, Float64} with indices SOneTo(2):
 1.0000197125126835
 0.9999999998505151
```

Is this close enough? If not, you can tweak the stopping criterion, see https://manoptjl.org/stable/solvers/index.html#StoppingCriteria-1, and note that you can combine stopping criteria just with `|`:

```julia
julia> xMean = gradient_descent(M, F, gradF, x0;
           stepsize=ArmijoLinesearch(),
           stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(10.0^-12),
           debug=[:Iteration, :Cost, "\n", 25]
       )
Initial  F(x): 2.831969279439222
# 25     F(x): 0.2000020367614238
# 50     F(x): 0.2000000000000529
2-element MVector{2, Float64} with indices SOneTo(2):
 1.0000197125126835
 0.9999999999999954
```

Let me know if this solves your problem. One could, for example, change the default to use Armijo, though that per default takes quite a few function evaluations to determine the step size.
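For reference, here is a minimal, self-contained sketch of the kind of setup the calls above assume. The manifold, cost, gradient, and data below are hypothetical (not the MWE from this thread), and the sketch assumes the `F(M, x)` / `gradF(M, x)` signature used later in this discussion; depending on the Manopt.jl version the manifold argument may not be part of the signature.

```julia
using Manifolds, Manopt

# hypothetical data: minimise the mean of squared distances to a few points in the plane
M   = Euclidean(2)
pts = [[1.0, 2.0], [3.0, 0.0], [-1.0, 1.0]]
n   = length(pts)

F(M, x)     = sum(distance(M, x, p)^2 for p in pts) / (2n)
gradF(M, x) = -sum(log(M, x, p) for p in pts) / n   # Riemannian gradient of F

x0 = [0.0, 0.0]
xMean = gradient_descent(M, F, gradF, x0;
    stepsize=ArmijoLinesearch(),
    stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(10.0^-12),
    debug=[:Iteration, :Cost, "\n", 25],
)
```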
-
Addendum, after clicking on your link – it seems the allocate in ManifoldsBase does something strange in NelderMead!(M.manifold, F, [random_point(M) for _=1:3]). Note that Nelder-Mead is not a very efficient algorithm and should only be used if you do not have any gradient. edit: sorry, my bad :) I do not work with NelderMead usually. NelderMead(M.manifold, F, [random_point(M) for _=1:3]) just works fine, too, as does NelderMead(M.manifold, F). But note that its convergence is veeeeeery slow (the default 2000 iterations are not enough).
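For completeness, a short sketch of the two call patterns mentioned above, with a hypothetical manifold and cost (it uses random_point, the point sampler available in the package versions of this thread, and the cost signature may differ between Manopt versions):

```julia
using Manifolds, Manopt

M = Sphere(2)                                  # a 2-dimensional example manifold
F(M, p) = distance(M, p, [0.0, 0.0, 1.0])^2    # hypothetical cost: distance to the north pole

population = [random_point(M) for _ in 1:3]    # one point per simplex vertex (dimension + 1)
x1 = NelderMead(M, F, population)              # explicit population
x2 = NelderMead(M, F)                          # default: a random population is generated
```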
-
Hi Ronny Johan, thanks, this helps me follow better too.
hmm, makes me curious about #64 again. I'll check when I get a chance and comment there if something shows up.
-
No, that was my mistake at first glance – the error is that you have to provide a population, not a starting point. Then everything works :)
-
Oh, and a final look, since you used BFGS in Optim. We have that on manifolds, too, just use

```julia
julia> quasi_Newton(M, F, gradF, x0)
2-element MVector{2, Float64} with indices SOneTo(2):
 1.0204551069445331
 1.000000000000014
```

whose default is the (inverse) BFGS here, too.
-
Ah, got it, thanks!
-
Perfect. Let me know when you think a default should be adapted or the docs should be changed (for example, at first glance, Stopping Criteria and Step Size rules could be made a little more prominent).
-
Maybe just a
-
Where? The default step size is mentioned for every algorithm, as are the default stopping criteria.
-
I made a PR on where I would add it... Feel free to ignore it if that does not fit/work. The reason for putting it there is that, when following a design template example for the first time, one expects the defaults to work (regardless of performance). When I get stuck on the first example, I find myself scrolling to the bottom of the "Getting Started" page hoping to find the obvious stumbling blocks. A FAQ is the second-best place (but can be a discouraging search). Once users see their own first example working (not just copy-paste), they start to invest more time reading deeper into all the docs. Training wheels first, with the absolute minimum of details to get going, then scale up on details from there.
-
Manopt.jl does not have a FAQ.
-
Thanks a lot, I did find the stopping criteria and played around with them. However, I expected the default to converge in such a simple example. There are quite a few stumbling blocks for me, but I can already see the potential of Manopt. I know it's a relatively new package and perhaps we can help build out examples.
So, I'm in agreement with this, and would prefer defaults that just work (at the possible cost of performance, with a note or something), especially if the user comes from other packages like Optim or JuMP, where one can be a user without understanding the underlying algorithm. Perhaps something like this, from the Optim.jl docs, would help a new user like me a lot:
-
It's perhaps not part of this issue, but let me list some problems I encounter:

```julia
julia> Manopt.NelderMead(MP, F, xp)
ERROR: MethodError: no method matching NelderMeadOptions(::ProductRepr{Tuple{ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}, ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}}}, ::StopAfterIteration; α=1.0, γ=2.0, ρ=0.5, σ=0.5, retraction_method=ExponentialRetraction(), inverse_retraction_method=LogarithmicInverseRetraction())
```

or perhaps other errors, e.g.:

```julia
julia> xMean = quasi_Newton(M, F_prior, gradF_prior, x; debug=[:Iteration, :Cost, "\n", 10])
Initial  F(x): 1.4945569745366585
ERROR: vector_transport_to! not implemented on ProductManifold(TranslationGroup(2; field = ℝ), SpecialOrthogonal(2)) for arguments ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}, ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}, ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}, ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}} and ParallelTransport.
```

```julia
julia> particle_swarm(M, F_prior; n, x0)
ERROR: ArgumentError: broadcasting requires an assigned BroadcastStyle
```

```julia
julia> gradF(MP, xp)
ERROR: MethodError: no method matching fill!(::ProductRepr{Tuple{MVector{2, Float64}, MMatrix{2, 2, Float64, 4}}}, ::Int64)
```

The other issues are more to do with a lack of knowledge coming from "I just want to optimize my cost function", such as:

I hope this helps improve the quality of this great package; let me know if there is anywhere I can help (although I still have a lot to learn).
-
For the first and simple example, there is a certain tradeoff, especially on manifolds. We can still, for sure, switch to Armijo; I don't have a strong preference there. The summary sounds like a good idea, and holds here completely analogously for Newton with TR, and LBFGS (BFGS uses https://manoptjl.org/stable/solvers/quasi_Newton.html#Manopt.AbstractQuasiNewtonDirectionUpdate; switch to https://manoptjl.org/stable/solvers/quasi_Newton.html#Manopt.QuasiNewtonLimitedMemoryDirectionUpdate for the LBFGS one, which might also be good to have a shorter constructor for). Additionally, for non-smooth functions, if you have proxes, use CyclicProximalPoint or DouglasRachford (Optim does not have those, but then this package also has a little different focus, namely to include non-smooth optimisation on manifolds), or, if you are adventurous, the ChambollePock algorithm from the most recent paper. For the summary –
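To make the non-smooth part concrete, here is a rough sketch computing a Riemannian median, a case where proximal maps are available but the cost is not differentiable at the data points. The data and manifold are hypothetical, and it assumes Manopt's prox_distance and the cyclic_proximal_point interface as used in the package tutorial, plus the F(M, x) cost signature used elsewhere in this thread:

```julia
using Manifolds, Manopt

M    = Sphere(2)
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
n    = length(data)

F(M, x) = sum(distance(M, x, d) for d in data) / n                 # sum of (unsquared) distances
proxes  = Function[(M, λ, x) -> prox_distance(M, λ / n, d, x, 1) for d in data]

xMedian = cyclic_proximal_point(M, F, proxes, data[1])
```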
-
Thanks! A summary like that is perfect. I was actually looking for LBFGS, but didn't recognize it until you made it obvious just now 🤦
-
Ahh, thanks. I was working from our factors and missed it, since it's done with the Mahalanobis distance in our case. It should be distance squared. I still can't get a satisfactory result, though. Perhaps I should take a step back and rather ask how to solve on manifolds such as SE(2), SE(3), SO(2), SO(3). Perhaps start with SE(2) for a POC.

```julia
using Manifolds

# M - manifold
# p and q are points on the manifold
# m is a measurement from point p
# meas is the prior measurement
function PointPoint(M, m, p, q)
    q̂ = compose(M, p, m)
    return distance(M, q, q̂)^2
end

function Prior(M, meas, p)
    return distance(M, meas, p)^2
end
```

The factors are combined into cost functions that depend on the structure of the factor graph.
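As a quick sanity check of the factors above, a hypothetical usage on SE(2) (SpecialEuclidean(2) from Manifolds.jl, with the same ProductRepr layout of translation and rotation that shows up in the error messages earlier in this thread; the concrete values are illustrative only):

```julia
using Manifolds

M = SpecialEuclidean(2)
p = ProductRepr([0.0, 0.0], [1.0 0.0; 0.0 1.0])                  # starting pose (identity)
θ = 0.1
m = ProductRepr([1.0, 0.0], [cos(θ) -sin(θ); sin(θ) cos(θ)])     # odometry measurement from p
q = compose(M, p, m)                                             # the pose the measurement predicts

PointPoint(M, m, p, q)   # ≈ 0.0, since q equals the predicted pose q̂
Prior(M, p, q)           # > 0, the squared distance between the two poses
```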
-
Distance squared is better, that is geodesically convex (so locally it is convex). For distance squared, keep in mind that globally this might still be non-unique. For example, on the sphere, if you minimise with respect to x the function distance(M, n, x)^2 + distance(M, s, x)^2, where n and s are the North and South Pole, respectively, then any x on the equator is a minimiser of the function. Your last sentence I again can't follow: what factors? What is to combine here? Which graph? What is its structure?
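A quick numerical illustration of that example (the specific points are arbitrary choices):

```julia
using Manifolds, LinearAlgebra

M = Sphere(2)
n, s = [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]      # north and south pole
f(x) = distance(M, n, x)^2 + distance(M, s, x)^2

f([1.0, 0.0, 0.0])                # equator point: 2 * (π/2)^2 ≈ 4.93
f([0.0, 1.0, 0.0])                # another equator point: same value
f(normalize([1.0, 0.0, 1.0]))     # off the equator: (π/4)^2 + (3π/4)^2 ≈ 6.17, larger
```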
-
Hi, I found something interesting. If I set
-
That is good to know :) Retractions should locally be as fine as exp (inverse retractions as fine as log), but are usually cheaper or numerically more stable. So good that you checked.
-
That is where IncrementalInference.jl comes in; in short, you will get the ring around the equator as a belief if you set it up that way. For example, Figure 1 in https://marinerobotics.mit.edu/sites/default/files/fourie_iros19_manifolds.pdf has a similar case on a circular manifold.
-
Sorry, an explanation will be too much for a comment. Perhaps a basic example:

```julia
# for a user-defined graph that looks like this:
#
#   p -- x1 -- f1 -- x2 -- f2 -- x3
#         \                 /
#          ------- f3 ------
#
# the cost would look like this:
cost(M, x) = Prior(M, p, x[1])^2 + PointPoint(M, f1, x[1], x[2])^2 +
             PointPoint(M, f2, x[2], x[3])^2 + PointPoint(M, f3, x[1], x[3])^2
```
-
Note that I just turned this into a discussion, since it is not an issue (yes, I also just activated discussions). I hope that is ok.
-
I found the mistake in Manopt. I'll see if I can fix it in a PR.
-
BTW, with the fix, my basic
-
Slightly related to this, but I can also open an issue in Manifolds.jl if it fits better there: I'm looking into using the Mahalanobis distance in the cost function, following (Pennec [15]). It led me to take a closer look at the choice of the inner product (according to the docs, the metric from the embedding for SO(n)) and the distance function in Manifolds.jl (on SO(n)):

```julia
using Manifolds, StaticArrays

M = Rotations(2)
ϵSO2 = SA[1. 0; 0 1]
p = ϵSO2
X = hat(M, ϵSO2, [pi/4])
q = exp(M, ϵSO2, X)
dR = distance(M, p, q)
# 1.1107207345395915

M = Circle()
p = 0
q = pi/4
dC = distance(M, p, q)
# 0.7853981633974483

dR * 1/sqrt(2) == dC
```
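Continuing the Rotations(2) snippet above, a short check of where the factor √2 comes from: hat(Rotations(2), ϵSO2, [θ]) is a skew-symmetric matrix with ±θ off the diagonal, whose Frobenius norm is √2·θ, and the numbers above are consistent with distance on Rotations(2) being that Frobenius norm of the Lie-algebra logarithm, while Circle() measures the angle θ itself:

```julia
using LinearAlgebra

norm(X)          # Frobenius norm of the hat of π/4: √2 * π/4 ≈ 1.1107…, i.e. dR above
sqrt(2) * pi/4   # same value, so dR == √2 * dC
```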
-
Hi, I was just curious as to how much work it would be to make our own data structure (Factor Graphs) work with Manopt.jl. The alternative looks like it would be to copy all the points and their manifold into a new ProductManifold and ProductRepresentation. (This is similar to how we currently work with Optim.jl.)
-
Ah, with that example it gets clearer. So what you have is that the factors are single real-valued functions, and what you want to minimise is their sum? Then one could write their gradients as living on said product manifold (with corresponding zero tangent vectors for variables that do not appear). If the graph is very sparse, one could think of trying to do this sparsely, too.
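To sketch that idea with hypothetical names (variables on Sphere(2) for simplicity, and only a prior-type factor, since a PointPoint factor would additionally need the differential of compose; zero_tangent_vector is the name in the Manifolds.jl version of this thread, zero_vector in newer ones):

```julia
using Manifolds

M  = ProductManifold(Sphere(2), Sphere(2), Sphere(2))
a1 = [1.0, 0.0, 0.0]          # a prior measurement attached to the first variable only

# gradient of distance(M1, a1, x1)^2 w.r.t. x1 is -2 * log(M1, x1, a1);
# variables that no factor touches keep a zero tangent vector
function gradcost(M, x)
    M1, M2, M3 = M.manifolds
    x1, x2, x3 = submanifold_components(M, x)
    return ProductRepr(
        -2 * log(M1, x1, a1),
        zero_tangent_vector(M2, x2),
        zero_tangent_vector(M3, x3),
    )
end

x = ProductRepr([0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0])
gradcost(M, x)
```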
-
From Slack discussion on metrics for SE(n):
I don't think our functions are written to work with the group exponent (log_lie).
-
Discussion on AD, quoted from Slack (messages by Mateusz Baran and Johannes Terblanche).
-
Just to give an update on our ongoing upgrade to Manifolds.jl and Manopt.jl: I chose Levenberg-Marquardt to start with, as it's the most familiar to me and I could use a sparse Jacobian. Related questions: do you have any recommendations for a Manopt solver to try next? I want to have more than one to choose from, and it looks like if I upgrade to support one of the gradient-based methods, a few will work in the same way.
-
Hi, I'm new to Manopt and I'm trying to figure out how to integrate Manopt into IncrementalInference (also see JuliaRobotics/IncrementalInference.jl#1276 for more complex attempts).
The optimization is formulated as a factor graph, and this example represents a simple factor graph with one variable and 2 priors (so basically the mean of 2 points). I don't get the results I expect, and I made this MWE:
cc @dehann
EDIT: I'm on

```
[1cead3c2] Manifolds v0.5.4
[0fc0a36d] Manopt v0.3.9
```
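Since the MWE itself is not reproduced above, here is a purely hypothetical sketch of the described setup: one variable with two prior factors, placed on TranslationGroup(2), which matches the 2-element results and the "essentially R2" remark in the replies. All values are illustrative, and the F(M, x) signature may differ between Manopt versions:

```julia
using Manifolds, Manopt, StaticArrays

M  = TranslationGroup(2)
z1 = MVector(1.0, 0.8)          # first prior measurement (illustrative)
z2 = MVector(1.0, 1.2)          # second prior measurement (illustrative)

F(M, x)     = distance(M, z1, x)^2 + distance(M, z2, x)^2
gradF(M, x) = -2 * log(M, x, z1) - 2 * log(M, x, z2)

x0    = MVector(0.0, 0.0)
xMean = gradient_descent(M, F, gradF, x0; stepsize=ArmijoLinesearch())
```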