
CMA-ES #373

Merged: 22 commits, merged into master on Apr 6, 2024
Conversation

@mateuszbaran (Member) commented Mar 29, 2024

The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm, adapted to the Riemannian setting.
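As a rough illustration of the intended call pattern, a minimal hypothetical sketch (the cost function and seed here are illustrative, not part of this PR):

```julia
using Manopt, Manifolds, Random

# Minimize a simple smooth cost on the 2-sphere; minima are at the poles ±e₃.
M = Sphere(2)
f(M, p) = 1 - p[3]^2

Random.seed!(42)           # reproducible start point
p = cma_es(M, f, rand(M))  # the usual Manopt (manifold, cost, start) pattern
```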

TODO:

  • Stopping criteria (NoEffectAxis, NoEffectCoord, EqualFunValues, Stagnation, TolXUp, TolFun, TolX).
  • More tests.
  • Check performance.
  • I'm not sure about the in-place convenience wrapper.

TODO beyond this PR:

  • spd_matrix_transport_to is useful more generally.
  • Maybe support vector transport in coordinates.
  • Parallelization of objective evaluation.

@mateuszbaran mateuszbaran added the WIP Work in Progress (for a pull request) label Mar 29, 2024
codecov bot commented Mar 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.74%. Comparing base (1ea2ac6) to head (8c39dca).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #373      +/-   ##
==========================================
+ Coverage   99.73%   99.74%   +0.01%     
==========================================
  Files          73       74       +1     
  Lines        6876     7167     +291     
==========================================
+ Hits         6858     7149     +291     
  Misses         18       18              


@kellertuer (Member) commented Mar 30, 2024

Until now I have just one minor remark: for all algorithms I tried to find speaking names. CPPA is cyclic_proximal_point, and ALM and EPM also enjoy their long names.
On the other hand, I do see that the five-letter acronym here would really be a bit long if written out as a function name. Still, this would be the first shortened one, right? Maybe we can think about a tradeoff?

@mateuszbaran (Member, Author)

Yes, I see. Unfortunately the five word name seems a bit too long and I don't have a better idea.

@kellertuer (Member)

I totally understand this; that is also why I tried to phrase it as carefully as possible.
While for documentation I am fine with mathematical variable letters, in code (unless it is a function-internal variable) I do prefer somewhat more speaking names, which depend a bit less on "the user already knows". But for this case I also do not yet see how we could do something a bit longer than cma_es that helps beginners without confusing experts (who do understand what cma_es is).

@mateuszbaran (Member, Author)

I think those NoEffect stopping criteria aren't particularly important, and we already have a lot of stopping-criterion code here. @kellertuer, it would be nice to export all of them, but I don't want to pollute the namespace too much. At least TolX and TolFun are things a user may want to adjust, though they would likely want to keep the other criteria too. Stagnation and EqualFunValues are somewhat generic: in theory they can be applied to any evolutionary optimization algorithm, like the particle swarm we already have here, at least if we adjusted the particle swarm to track some population heuristics. Not something I'd like to do in this PR, but it could be an idea for the future.

@kellertuer (Member) left a review comment


Hm, could some of the stopping criteria maybe be combined into one? Also, they currently have somewhat cryptic names, but I can check them out later once they have a bit more description.
For example, EqualFunValuesCondition without a reference to CMA-ES (that is, out of context) sounds quite random and might even be mistaken as fitting other solvers. I tried to use StopWhenX names for stopping criteria; then exporting all of them is also not that bad, and the code stays relatively readable.
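For reference, a hedged sketch of how such exported StopWhenX criteria compose in Manopt via `|` (the CMA-ES-specific criterion name and its arguments are illustrative, since the names are exactly what is under discussion here):

```julia
# Illustrative only: an iteration cap combined with a hypothetical
# population-concentration criterion, using M and f from the sketch above.
sc = StopAfterIteration(5_000) | StopWhenPopulationConcentrated(1e-12, 1e-12)
p = cma_es(M, f, rand(M); stopping_criterion=sc)
```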

Review thread on docs/src/solvers/cma_es.md (outdated, resolved).
@kellertuer (Member)

At least TolX and TolFun

Do you mean stopping criteria like StopWhenChangeLess and StopWhenFunctionDecreaseLess (the second does not exist yet, I think)? Sure, those could have tolerances a user can also set directly via a keyword, if you feel that is reasonable. For 1-2 stopping criteria in the default I usually did not pass keywords to them, but if we have 4-5 it might be a good idea to accept keyword parameters that set the stopping-criterion tolerances.

@mateuszbaran (Member, Author) commented Mar 31, 2024

Hm, could some of the stopping criteria maybe be combined into one? Also, they currently have somewhat cryptic names, but I can check them out later once they have a bit more description.
For example, EqualFunValuesCondition without a reference to CMA-ES (that is, out of context) sounds quite random and might even be mistaken as fitting other solvers. I tried to use StopWhenX names for stopping criteria; then exporting all of them is also not that bad, and the code stays relatively readable.

I used the names from appendix B.3 of Hansen's paper. We could surely find better names, but these are fairly complex criteria that don't have nice StopWhenX names. StopWhenSigmaTimesMaximumEigenvalueOfCovarianceMatrixGreaterThan?

Do you mean stopping criteria like StopWhenChangeLess and StopWhenFunctionDecreaseLess (the second does not exist yet, I think)?

No, I mean those from Hansen's paper. TolX is specific to CMA-ES and TolFun is specific to evolutionary algorithms.
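For context, the TolFun idea from Hansen's appendix B.3 in a minimal Euclidean-style sketch (the window length and function name are illustrative, not this PR's implementation):

```julia
# Stop when the range of best objective values over a trailing window of
# generations falls below a tolerance.
function tol_fun_satisfied(best_values::Vector{<:Real}, tol::Real; window::Int=20)
    length(best_values) < window && return false
    recent = @view best_values[(end - window + 1):end]
    return maximum(recent) - minimum(recent) < tol
end
```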

@mateuszbaran
Copy link
Member Author

For 1-2 stopping criteria in the default I usually did not pass keywords to them, but if we have 4-5 it might be a good idea to accept keyword parameters that set the stopping-criterion tolerances.

Sure, that makes sense.

@kellertuer (Member)

I used the names from appendix B.3 of Hansen's paper. We could surely find better names, but these are fairly complex criteria that don't have nice StopWhenX names. StopWhenSigmaTimesMaximumEigenvalueOfCovarianceMatrixGreaterThan?

No. That is too short, we would need something a bit longer I think ;)

We could maybe look for an interpreting name? What does it mean that sigma times the largest eigenvalue of the covariance matrix exceeds some value?

Do you mean stopping criteria like StopWhenChangeLess and StopWhenFunctionDecreaseLess (the second does not exist yet, I think)?

No, I mean those from Hansen's paper. TolX is specific to CMA-ES and TolFun is specific to evolutionary algorithms.

Ah, OK. We should then check that paper for more details and (as above) for interpreting, not-too-long names.
I was not able to work the last few days (caught a cold, but getting better already), so tomorrow I will first have to catch up on teaching preparations, but besides that I can surely help looking for nice names (as you might have noticed, I like doing that anyway).

@kellertuer (Member)

Here are a few ideas for naming:

  • NoEffectAxis, NoEffectCoord – in the appendix he is a bit brief about what m actually does, but since this method is basically mean-based, we could use StopWhenMeanX names for these two instead of population ones? I would have to read the paper more thoroughly for a more detailed proposal here.
  • EqualFunValues – for me this is a stronger statement than TolFun, so basically super-concentration?
  • Stagnation – maybe StopWhenCostStagnates?
  • TolXUp – when including a warning about a possibly too-small sigma, this could also be called StopWhenPopulationDiverges?
  • TolFun – this is similar, but with a longer observation phase than a StopWhenCostDecreaseLess? Or even StopWhenCostLess? Hansen speaks about the range of function values, so one could say StopWhenPopulationCostConcentrated?
  • TolX – I think this is basically StopWhenPopulationConcentrated?

The overview paper, though Euclidean, is great, but we could also mention https://ieeexplore.ieee.org/document/5299260.
Also, the bib entry can be improved by removing the DOI and changing the URL to eprint/eprinttype, but I can do that sometime later as well. Of course there is also Dreisigmeyer (already in the bib), https://optimization-online.org/wp-content/uploads/2017/03/5916.pdf – I would still have to check for the right reference.

@mateuszbaran (Member, Author)

  • NoEffectAxis, NoEffectCoord – in the appendix he is a bit brief about what m actually does, but since this method is basically mean-based, we could use StopWhenMeanX names for these two instead of population ones? I would have to read the paper more thoroughly for a more detailed proposal here.

I'd actually skip those two for this PR. It seems they just terminate a bit earlier than the other conditions would when the search stagnates.

  • EqualFunValues – for me this is a stronger statement than TolFun, so basically super-concentration?

Yes, this is correct.

  • Stagnation – maybe StopWhenCostStagnates?

I think we usually say that the algorithm stagnates, not the cost. Maybe StopWhenEvolutionStagnates?

We could maybe look for an interpreting name? What does it mean that sigma times the largest eigenvalue of the covariance matrix exceeds some value?

  • TolXUp – when including a warning about a possibly too-small sigma, this could also be called StopWhenPopulationDiverges?

That's a good idea.

  • TolFun – this is similar, but with a longer observation phase than a StopWhenCostDecreaseLess? Or even StopWhenCostLess? Hansen speaks about the range of function values, so one could say StopWhenPopulationCostConcentrated?

I like StopWhenPopulationCostConcentrated.

  • TolX – I think this is basically StopWhenPopulationConcentrated?

TolX is a bit stronger, because it also demands that the rank-one update of the covariance matrix is small, so it's more "the population is concentrated and unlikely to get de-concentrated".
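To make that concrete, a hedged Euclidean-style sketch of the TolX-type check following Hansen's appendix B.3 (names and exact scaling are illustrative):

```julia
using LinearAlgebra

# "Concentrated and unlikely to get de-concentrated": both the sampling spread
# σ·sqrt(Cᵢᵢ) and the rank-one evolution path σ·p_c must be below the tolerance.
function tol_x_satisfied(σ::Real, C::AbstractMatrix, p_c::AbstractVector, tol::Real)
    return all(σ .* sqrt.(diag(C)) .< tol) && all(σ .* abs.(p_c) .< tol)
end
```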

The overview paper, though Euclidean, is great, but we could also mention https://ieeexplore.ieee.org/document/5299260.
Also, the bib entry can be improved by removing the DOI and changing the URL to eprint/eprinttype, but I can do that sometime later as well. Of course there is also Dreisigmeyer (already in the bib), https://optimization-online.org/wp-content/uploads/2017/03/5916.pdf – I would still have to check for the right reference.

Thanks for the link to Colutto's paper; I must have missed it. They seem to skip the parallel transport of the covariance matrix and base their work on an older variant of the Euclidean CMA-ES, but otherwise it's more or less the same thing. Dreisigmeyer's paper doesn't mention CMA-ES, but it's also about direct optimization on manifolds, so maybe it could be mentioned somewhere.

@kellertuer (Member)

I think we usually say that the algorithm stagnates, not the cost. Maybe StopWhenEvolutionStagnates?

That is also fine with me; I just personally thought it could also be a very flat area, where the evolution continues but the cost stagnates.

So both EqualFunValues and its weaker cousin TolX still need a good name that is stronger than "population concentrated", hm. I do not yet have a good name here, but I like the others so far.

For the Colutto paper, I have not yet checked it too closely, but it might still be fair to mention it.
I haven't had the time to check too much against Dreisigmeyer, but my student will work on his mesh version next semester. I slightly misstated the paper in my last post; I meant to mention https://optimization-online.org/wp-content/uploads/2007/08/1742.pdf – and LTMADS is a project for said student (but sure, that is more grid-based).

@mateuszbaran (Member, Author)

I've checked performance. For non-trivial examples it's bound by either objective evaluation or eigendecomposition, so I've reworked the code a bit to make sure only one eigendecomposition is performed per iteration. A standard trick in Euclidean CMA-ES is updating the decomposition every few iterations rather than every single one, but to make that work here we'd still need fast decomposition transport. It could be a fun follow-up project, but for this PR I think it's fast enough.
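Roughly, one eigendecomposition can serve both the sampling step and the inverse square root needed for the step-size path; a minimal sketch of the idea (not the PR's actual code):

```julia
using LinearAlgebra

C = [2.0 0.5; 0.5 1.0]            # example covariance matrix
vals, vecs = eigen(Symmetric(C))  # the one decomposition per iteration
sqrt_C = vecs * Diagonal(sqrt.(vals)) * vecs'            # for sampling new points
inv_sqrt_C = vecs * Diagonal(inv.(sqrt.(vals))) * vecs'  # for the σ-path update
```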

Review threads: docs/src/solvers/cma_es.md (outdated, resolved); src/plans/stopping_criterion.jl (resolved); src/solvers/cma_es.jl (resolved).
@mateuszbaran (Member, Author)

@kellertuer the latest failure is due to the convex bundle method; maybe something you'd like to take a look at?

┌ Warning: WolfePowellLinesearch
│   caller = ip:0x0
└ @ Core :-1
┌ Warning: The Lagrange multiplier is positive.
│ At iteration #11 the negative of the Lagrange multiplier, -ξ, became negative.
│ 
│ Consider increasing either the `diameter` keyword argument, or changing
│ one of the parameters involved in the estimation of the sectional curvature, such as
│ `k_max`, or `ϱ` in the `convex_bundle_method` call.
└ @ Manopt ~/work/Manopt.jl/Manopt.jl/src/solvers/convex_bundle_method.jl:587
[… the same warning repeats for iterations #12 through #20 …]
A simple median run: Test Failed at /Users/runner/work/Manopt.jl/Manopt.jl/test/solvers/test_convex_bundle_method.jl:178
  Expression: distance(M, q2, m) < 0.01
   Evaluated: 0.013226760924468948 < 0.01

Stacktrace:
 [1] macro expansion
   @ ~/hostedtoolcache/julia/1.9.4/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
 [2] macro expansion
   @ ~/work/Manopt.jl/Manopt.jl/test/solvers/test_convex_bundle_method.jl:178 [inlined]
 [3] macro expansion
   @ ~/hostedtoolcache/julia/1.9.4/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
 [4] macro expansion
   @ ~/work/Manopt.jl/Manopt.jl/test/solvers/test_convex_bundle_method.jl:150 [inlined]
 [5] macro expansion
   @ ~/hostedtoolcache/julia/1.9.4/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
 [6] top-level scope
   @ ~/work/Manopt.jl/Manopt.jl/test/solvers/test_convex_bundle_method.jl:6
WARNING: using JuMP.VectorConstraint in module Main conflicts with an existing identifier.

@kellertuer (Member)

Increase the tolerance; we are reworking that algorithm a bit currently anyway, since those warnings and errors appeared more often than we expected.

@mateuszbaran (Member, Author)

Increase the tolerance; we are reworking that algorithm a bit currently anyway, since those warnings and errors appeared more often than we expected.

Sure, that's good to know.

@mateuszbaran mateuszbaran added Ready-for-Review A label for pull requests that are feature-ready and removed WIP Work in Progress (for a pull request) labels Apr 2, 2024
@mateuszbaran mateuszbaran marked this pull request as ready for review April 2, 2024 08:49
@mateuszbaran (Member, Author)

CMA-ES seems to handle this problem (JuliaManifolds/ManoptExamples.jl#13) fairly well, and the performance looks competitive with Evolutionary.jl, so I'd say this can be reviewed now.

@kellertuer (Member) left a review comment


Thanks for this nice contribution; here are a few comments. I have not yet looked at the details of the stopping criteria, but they could probably also have longer docstrings, I think.

Review threads: docs/src/references.bib (outdated, resolved); docs/src/solvers/cma_es.md (outdated, resolved); src/solvers/cma_es.jl (eight threads, two outdated, all resolved).
@mateuszbaran (Member, Author)

I think I've addressed all your points.

@kellertuer (Member)

I would still prefer a nicer constructor for the state, with (a) the manifold as the first argument (to fill defaults properly) and (b) initialization of all internal/copy things. The idea is that cma_es can call this one, and it is then easier to read in the code itself.

The current constructor does have M first, but the retraction and vector transport, for example, can be given nice defaults (and become kwargs), and so can the basis and rng, maybe even stop.

The rest looks fine so far, I think.
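Something like the following pattern, perhaps; a hedged sketch of the suggestion with hypothetical names, not the merged code:

```julia
using Manopt, ManifoldsBase, Random

# Manifold-first state constructor: each tunable becomes a keyword with a
# manifold-derived default, so the solver call only has to forward kwargs.
function cma_es_state_sketch(
    M::AbstractManifold;
    p_mean=rand(M),
    σ::Real=1.0,
    retraction_method=default_retraction_method(M, typeof(p_mean)),
    vector_transport_method=default_vector_transport_method(M, typeof(p_mean)),
    basis=DefaultOrthonormalBasis(),
    stopping_criterion=StopAfterIteration(5_000),
    rng=Random.default_rng(),
)
    # ... initialize covariance, evolution paths, and population buffers here ...
    return (; p_mean, σ, retraction_method, vector_transport_method, basis,
            stopping_criterion, rng)
end
```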

@mateuszbaran (Member, Author)

I've added defaults for some arguments, but many of them could potentially be changed if someone wants to experiment. There is quite a lot of logic in cma_es! because it reflects Hansen's paper, and providing those defaults in the state constructor would be a bit messy.

@kellertuer (Member)

Yes, I saw that part of the logic.
You now changed at least the manifold part to have nice defaults, which is in line with the other solvers. That's nice.

Review threads: docs/src/solvers/cma_es.md (outdated, resolved); src/solvers/cma_es.jl (two threads, outdated, resolved).
@kellertuer (Member) left a review comment


Thanks for the thorough work; I approve this for now. A tutorial for each solver would maybe be nice in the long run.

@mateuszbaran mateuszbaran merged commit 7f8d7c6 into master Apr 6, 2024
15 checks passed
@mateuszbaran (Member, Author)

A tutorial could definitely be useful for the more involved solvers; this one is actually fairly straightforward to use.

I think we can wait with registering a new Manopt version until #376 is merged.

@kellertuer (Member)

Still, a small tutorial for most solvers might be nice – maybe even just one tutorial for all "derivative-free" ones, since their calls are really similar.

And sure, let's wait with registration for that other PR.

@kellertuer kellertuer deleted the mbaran/cma-es branch May 4, 2024 17:30