Bound and Scaling questions #195

Optiuse · 2018-12-19T15:20:43Z

Hello and thanks, that is a wonderful library.

I would like to improve with cmaes the optimization time over bruteforce start/step/stop-scheme.
I work a lot with different types and bounds of values in the functions (bool, int, double). This library seems works with double so I round the double values to the data type before give them to the function. Is this the right way to deal with bool and integer Values?

But the important thing is this:

I did some basic tests with 2 dimensions (double):
I have bounds of:
Var1: 0.2 – 4.0
Var2: 20 – 400

For x0 I set the middle of the bounds. x0[i] = (Start+Stop) / 2;

code:

int dim = libcmaesINPUT.size(); // my vector with struct with my start step stop bounds
double sigma = -1;
int lambda = -1;
uint64_t seeding = 0;
double lbounds[dim],ubounds[dim]; 
std::vector<double> x0(dim); 

for(int i = 0; i != libcmaesINPUT.size(); ++i)
{
   lbounds[i] = libcmaesINPUT[i].start;
   ubounds[i] = libcmaesINPUT[i].stop;
   x0[i] = (libcmaesINPUT[i].start + libcmaesINPUT[i].stop)*0.5;
}

GenoPheno<pwqBoundStrategy,linScalingStrategy> gp(lbounds,ubounds,dim);
CMAParameters<GenoPheno<pwqBoundStrategy,linScalingStrategy>> cmaparams(dim,&x0.front(),sigma,lambda,seeding,gp); 
cmaparams.set_algo(aCMAES);
CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy,linScalingStrategy>>(FUNC_CMAES_OPTIMIZATION,cmaparams);

Eigen::VectorXd bestparameters = gp.pheno(cmasols.get_best_seen_candidate().get_x_dvec());
std::cout << "bestparameters: " << bestparameters << std::endl;
std::cout << "best solution: ";
cmasols.print(std::cout << std::fixed,0,gp);
std::cout << std::endl;
std::cout << "optimization took " << cmasols.elapsed_time() / 1000.0 << " seconds\n";

I log internally in the function all generated return values.

Graphs:
left graph = returnF (is displayed as a positive value! (*-1)),
middle graph = Var1,
right graph = Var2

Test: With BruteForce Start/Step/Stop-scheme (StepSize: Var1=0.1 / Var2=5)

3003 runs
the global minimum is lower than about -1400

Test: CMAES with sigma/lambda both = -1:

420 runs
the bounds area is not fully utilized (see graph) (Var1 only 1.7-2.4, Var2 only 180-290)
does not find the global minimum
internally log: lowest minimum is about -534
but libcmaes says: best solution f-value = -387

Test: CMAES with sigma/lambda both = 10:

3230 runs (more than BruteForce)
Positive is the bounds area is fully utilized (see graph)
internally log: lowest minimum is about -1488
but libcmaes says: best solution f-value = -511

My internal log shows that in the process, some return values are generated which are smaller than the best solution (f-value) by libcmaes. The graph shows smaller (displayed in graph as higher *-1) values during optimization as at the end. Why it does libcmaes not track the smallest value?

With auto sigma/lambda = -1 the bounds area are not fully utilized.
Why does the automatic setting not find (based of the bound) the sigmars that the bounds area full utilized?

greetings

The text was updated successfully, but these errors were encountered:

nikohansen · 2018-12-19T22:38:20Z

This library seems works with double so I round the double values to the data type before give them to the function. Is this the right way to deal with bool and integer Values?

It is a possible way, but I wouldn't expect any algorithm designed for continuous variables (as CMA-ES is) to work exceptionally or even reasonably well on binary variables out of the box.

Why it does libcmaes not track the smallest value?

It should, maybe it does? If not, the burden to keep the record is left to the user (it's not rocket science, only burdensome).

Why does the automatic setting not find (based of the bound) the sigmars that the bounds area full utilized?

In general, bounds and the initial sigma(s) are on purpose seperate input variables that do not necessarily need to agree. The algorithm can be used to search the full space, but it can also be used to locally improve a provided initial solution even when bounds are given. The decision is up to the user based on the initial sigma(s). In the above case, either the problem should be re-scaled or different sigmas need to be given (I don't know by heart whether libcmaes provides this option).

Specifically, it seems that in libcmaes sigma can be set to -1 (which I don't like very much, because it leads to an arbitrary decision in the unbounded case). In the bounded case and with sigma=-1, the sigma could indeed be chosen based on the given bounds, see e.g. here.

Optiuse · 2018-12-20T01:52:45Z

It should, maybe it does? If not, the burden to keep the record is left to the user (it's not rocket science, only burdensome).

During optimization, the function returns values smaller than the minimum found by libcmaes.
I save in the function each return value with each call and get the full optimization process.
Here is another test with sigma = 10 and lambda = -1
Here graphically the sequence of the return values of the optimization. The representation is *-1

On the left the optimization starts and right it is finished. You can see that there are some iterations in the first half that were much smaller (values in graphic positive) than the found minimum from libcmaes at end.

the smallest return value during the iteration was -1383
libcmaes converges to -530 and correspondingly says final the best solution is -530

Something does not seem to be right… ?

In the above case, either the problem should be re-scaled or different sigmas need to be given (I don't know by heart whether libcmaes provides this option).

in the code the sigma is only one double variable for all dimensions. Is it possible with libcmaes to give each dimension its own sigma? But with scaling, this might not be necessary...
In the code above I use GenoPheno with „pwqBoundStrategy“ and „linScalingStrategy“ as shown here:
https://github.com/beniz/libcmaes/wiki/Using-parameter-space-transforms-known-as-genotype-phenotype-transforms
there is: „when the parameter space is bounded with known bounds, the scaling can then be automatically determined by the library“

is the defined sigma applied to the phenotype or genotype?
If sigma applied to genotype, how do I know how the bounds are in the genotype used by libcmaes?

greetings

nikohansen · 2018-12-21T14:23:19Z

Something does not seem to be right… ?

It's just the wrong return value in case of non-noisy functions, I guess. When the function is noisy it's not a good idea to return the solution that got the best value somewhere during the run.

I can't speak to the other questions, as I am not intimately familiar with all interface options of libcmaes.

Optiuse · 2018-12-21T16:50:15Z

Thanks, that's interesting! cmaes is still new to me.
Ok so cmaes is not looking for the absolute smallest value?
Can I understand it something like this: it looks for the "smallest value in the most stable parameter field"?

nikohansen · 2018-12-22T15:06:44Z

The analogy makes some sense, however it is not necessarily the most stable. Using the best-ever seen solution as attractor wouldn't stand necessarily in contradiction with stability, as any seen solution must have some positive attraction mass around it. Ignoring the best solution within the algorithm has conceptual (noisy fitness, unbiasedness) and technical (unbiasedness) reasons.

The smaller the population size (or rather the parent number mu) and in particular the smaller sigma, the more likely it is to converge into the attraction basin of a single good evaluated solution.

BTW, losing the attractor around the best solution is to my experience less likely to occur in larger dimension.

kostayScr · 2019-02-09T10:30:32Z

Please see this bug - it explains what is happening:
#199

beniz · 2019-02-09T14:37:44Z

It should, maybe it does? If not, the burden to keep the record is left to the user (it's not rocket science, only burdensome).

It does, use get_best_seen_candidate(). And apologies for the late answer, I was taken elsewhere.

beniz · 2019-02-09T14:39:43Z

In the above case, either the problem should be re-scaled or different sigmas need to be given (I don't know by heart whether libcmaes provides this option).

The API provides a way to specify a single sigma value.

beniz · 2019-02-09T14:48:02Z

If sigma applied to genotype, how do I know how the bounds are in the genotype used by libcmaes?

The bounds are fixed, see https://github.com/beniz/libcmaes/blob/7514782ebe7167c6a889173364c980874ac17b41/src/scaling.h#L147 and recommendations from this page apply: http://cma.gforge.inria.fr/cmaes_sourcecode_page.html#practical

beniz mentioned this issue Feb 9, 2019

Best solution (f-value and params) printed as best by CMASolutions object differs from best seen solution #199

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bound and Scaling questions #195

Bound and Scaling questions #195

Optiuse commented Dec 19, 2018

nikohansen commented Dec 19, 2018

Optiuse commented Dec 20, 2018

nikohansen commented Dec 21, 2018

Optiuse commented Dec 21, 2018

nikohansen commented Dec 22, 2018

kostayScr commented Feb 9, 2019

beniz commented Feb 9, 2019

beniz commented Feb 9, 2019

beniz commented Feb 9, 2019

Bound and Scaling questions #195

Bound and Scaling questions #195

Comments

Optiuse commented Dec 19, 2018

nikohansen commented Dec 19, 2018

Optiuse commented Dec 20, 2018

nikohansen commented Dec 21, 2018

Optiuse commented Dec 21, 2018

nikohansen commented Dec 22, 2018

kostayScr commented Feb 9, 2019

beniz commented Feb 9, 2019

beniz commented Feb 9, 2019

beniz commented Feb 9, 2019