Compare the results of limbo with BayesOPT #13
The first thing I tried is the simplest scenario possible: optimizing the Branin function.
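For reference, here is a minimal C++ definition of the Branin function in its standard formulation; the exact domain rescaling used in these experiments is an assumption on my part:

```cpp
#include <cmath>

// Branin function on its usual domain x1 in [-5, 10], x2 in [0, 15].
// Global minimum f* ~= 0.397887, reached at three points, e.g. (pi, 2.275).
// Benchmark suites often rescale inputs from [0, 1]^2 onto this domain.
double branin(double x1, double x2)
{
    const double pi = 3.14159265358979323846;
    const double a = 1.0;
    const double b = 5.1 / (4.0 * pi * pi);
    const double c = 5.0 / pi;
    const double r = 6.0;
    const double s = 10.0;
    const double t = 1.0 / (8.0 * pi);
    const double u = x2 - b * x1 * x1 + c * x1 - r;
    return a * u * u + s * (1.0 - t) * std::cos(x1) + s;
}
```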
To be fair, I had to remove some tricks that BayesOpt performs. For example, when optimizing the acquisition, it first does a global search with DIRECT and then a shorter local search with BOBYQA; after this optimization, it also applies some random perturbations to the current minimum sample and runs a small local search around it; another trick is that if the (squared) difference between two consecutive samples is less than the noise for N iterations (N configured by the user), it takes a completely random sample instead. Maybe there are more tricks that I haven't seen, but I think these are the main ones.
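For illustration, here is a minimal sketch of that two-stage acquisition optimization using NLopt's C++ API (which both libraries can link against). The acquisition function, bounds, and evaluation budgets are hypothetical placeholders, not BayesOpt's actual settings:

```cpp
#include <nlopt.hpp>
#include <vector>

// Toy stand-in for the acquisition function; a real one would query the GP.
double acq(const std::vector<double>& x, std::vector<double>& /*grad*/, void*)
{
    double s = 0.0;
    for (double xi : x)
        s -= (xi - 0.3) * (xi - 0.3); // fake peak at (0.3, ..., 0.3)
    return s;
}

// Global DIRECT search, then a short local BOBYQA polish started from
// the DIRECT optimum (the first of the tricks described above).
std::vector<double> optimize_acquisition(unsigned dim)
{
    std::vector<double> x(dim, 0.5), lb(dim, 0.0), ub(dim, 1.0);
    double best = 0.0;

    nlopt::opt global(nlopt::GN_DIRECT, dim);
    global.set_lower_bounds(lb);
    global.set_upper_bounds(ub);
    global.set_max_objective(acq, nullptr);
    global.set_maxeval(500); // coarse global budget (illustrative)
    global.optimize(x, best);

    nlopt::opt local(nlopt::LN_BOBYQA, dim);
    local.set_lower_bounds(lb);
    local.set_upper_bounds(ub);
    local.set_max_objective(acq, nullptr);
    local.set_maxeval(100); // short local refinement (illustrative)
    local.optimize(x, best); // starts from the DIRECT solution

    return x;
}
```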
Really curious to see what the results will be.
Hi, I am curious about the results too. However, be careful: your benchmark should consider the "vanilla" version of the lib. Typically, a normal user of the libs will not go into the source code and remove these tricks (as you did), so the performances we want to compare are those that such a user is likely to observe. Do you see what I mean? Antoine CULLY
Hi Antoine,
You are completely right. I look forward to seeing the results! Cheers, Antoine CULLY
Thank you for these preliminary results. Do you have an idea about why BayesOpt performs better than Limbo in terms of accuracy? Antoine CULLY
Honestly, I have no idea. I'm using the exact same inner-optimization process for both (the DIRECT algorithm from the NLopt library; I used the nlopt branch of limbo, with some modifications to match BayesOpt).
Hmm... interesting. Apart from the random initialization of the GP, which parts of the algorithm(s) are stochastic? I am asking because, if all the algorithms are deterministic, then we can compare them step by step and see where they behave differently. Do the two algorithms use different inversion procedures? (If our inversion is less "exact", it may also explain why limbo is faster.) Antoine CULLY
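To make the inversion question concrete: the standard numerically stable approach is to never form an explicit inverse, but to Cholesky-factor the Gram matrix once and reuse triangular solves. A generic sketch with Eigen (which limbo uses); the structure and names here are mine, not either library's actual code:

```cpp
#include <Eigen/Dense>
#include <utility>

// GP posterior via a Cholesky factorization of K + noise * I = L * L^T.
struct GPPosterior {
    Eigen::LLT<Eigen::MatrixXd> llt; // Cholesky factor of K + noise * I
    Eigen::VectorXd alpha;           // (K + noise * I)^-1 * y

    GPPosterior(const Eigen::MatrixXd& K, const Eigen::VectorXd& y, double noise)
        : llt(K + noise * Eigen::MatrixXd::Identity(K.rows(), K.cols())),
          alpha(llt.solve(y)) {}

    // Posterior mean and variance at a test point, given the kernel vector
    // k_star = k(X, x_star) and the prior variance k_ss = k(x_star, x_star).
    std::pair<double, double> predict(const Eigen::VectorXd& k_star, double k_ss) const
    {
        const double mu = k_star.dot(alpha);
        const Eigen::VectorXd v = llt.matrixL().solve(k_star); // triangular solve
        return {mu, k_ss - v.squaredNorm()};
    }
};
```

If both libraries follow this scheme, accuracy differences would come from the factorization details (e.g. added jitter on the diagonal) rather than from the solves themselves.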
I think DIRECT is deterministic, and the only source of stochasticity is the initialization. Next week I will try feeding both libraries the same samples. With limbo it's super easy, but for BayesOpt I have to dig deeper into the code to modify it.
In the case where there is no hyper-parameter optimization, the kernel matrix of the next iteration contains the kernel matrix of the current one (it only adds one row and one column). Enjoy your weekend! Antoine CULLY
Yep, we know, and we discussed it briefly, but to do that we would have to change the API of the GP to allow incremental updates of the model, and, as you said, when the hyperparameters are optimized everything gets recomputed, so we think it's not really worth it. What they do is perform incremental updates of the Cholesky decomposition, and every N iterations (configurable by the user) they optimize the hyperparameters and recompute the full decomposition. In this case, since there is no hyperparameter optimization, they always perform the incremental updates, which, as you said, may lead to greater stability. Have a good weekend too!
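For concreteness, this is the textbook rank-append update being described, sketched with Eigen; the variable names and noise handling are my assumptions, not BayesOpt's actual code:

```cpp
#include <Eigen/Dense>
#include <cmath>

// When one sample is added, K grows by one row and one column:
//   K_new = [ K    k   ]     with K = L * L^T already factored.
//           [ k^T  kss ]
// The old factor L is kept; only the new bottom row is computed:
//   l = L^-1 * k   (one triangular solve, O(n^2) instead of O(n^3))
//   d = sqrt(kss - l^T * l)
Eigen::MatrixXd cholesky_append(const Eigen::MatrixXd& L,
                                const Eigen::VectorXd& k, // k(X, x_new)
                                double kss)               // k(x_new, x_new) + noise
{
    const int n = L.rows();
    Eigen::MatrixXd L_new = Eigen::MatrixXd::Zero(n + 1, n + 1);
    L_new.topLeftCorner(n, n) = L;
    const Eigen::VectorXd l = L.triangularView<Eigen::Lower>().solve(k);
    L_new.block(n, 0, 1, n) = l.transpose();
    L_new(n, n) = std::sqrt(kss - l.squaredNorm());
    return L_new;
}
```

After a hyperparameter re-optimization the kernel values change everywhere, so the factor must be rebuilt from scratch, which is exactly why the relearning only happens every N iterations.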
Hello! For some reason not all jobs ran on the cluster, and I ended up with only 17743 samples, but I think they are more than enough. They just confirm what I posted previously. (Accuracy and Speed plots omitted.)
Over the coming days I will continue benchmarking.
Thank you. This is very interesting. What is your stopping criterion?
Forgot to write it: 190 iterations in both cases.
Hi all, do you have more explanations about the accuracy difference? Best, Antoine CULLY
I have been working on other things, and will resume this tomorrow.
I just tried both libraries in the same scenario, but both starting from the same set of points (taken from here: http://mathematica.stackexchange.com/questions/47638/distribution-of-10-points-within-a-unit-square), and they give really different results. Limbo is able to find a suitable solution (with approximately the same accuracy as before), while BayesOpt doesn't get there and gets stuck in a local optimum. Also, from now on there are many things to try and paths to follow, so I propose that we establish a "Roadmap" for the comparisons:
I'm waiting for your suggestions on which scenarios are worth comparing. Should these scenarios always be fair? (For example, by default they don't recompute the hyperparameters every step but every 50, I think; so should we set this parameter to 1, or leave the default?) Please write any ideas that you have!
I agree with this.
I think this is also a good idea.
I think we need to test the scenarios once with fair comparison and once with the defaults of each library.
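The "fair vs. default" question above boils down to this relearning schedule. A minimal sketch of the policy (all function names are hypothetical placeholders; a period of 1 is the "fair" always-relearn setting, larger values mimic the defaults):

```cpp
// Skeleton of the "relearn hyperparameters every N iterations" policy.
void relearn_hyperparameters(); // full refit + full Cholesky, O(n^3)
void incremental_update();      // rank-append update, O(n^2)
void sample_next_point();       // optimize the acquisition, evaluate f

void bo_loop(int n_iterations, int hp_period)
{
    for (int i = 1; i <= n_iterations; ++i) {
        if (hp_period > 0 && i % hp_period == 0)
            relearn_hyperparameters();
        else
            incremental_update();
        sample_next_point();
    }
}
```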
Okay people, I've got some news that explains the difference in accuracy and speed.
Thank you. I am looking forward to seeing the results! Jean-Baptiste Mouret
I agree with Konstantinos on all of this. Jean-Baptiste Mouret
Just so you can check, here are the results with 1000 runs of each. (Accuracy and Speed plots omitted.)
Thank you. How do you measure accuracy? Is it the difference with the real optimum? If so, it seems that both libs fail on Rastrigin (which is not unexpected). Does a count of 1000 mean 1000 iterations? If so, that is a lot...
Exactly, accuracy is the difference with the real optimum. The count is the number of runs, not iterations; each run still uses 190 iterations.
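In code, that accuracy measure (simple regret) is just the gap between the best observed value in a run and the known optimum; a small sketch, with f_star assumed known for each benchmark function:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Accuracy of one run: |best observed value - known optimum f_star|.
// For Branin, f_star ~= 0.397887; the reported statistics would then be
// aggregated over all runs (e.g. 1000 or 5000 of them).
double accuracy(const std::vector<double>& observed, double f_star)
{
    const double best = *std::min_element(observed.begin(), observed.end());
    return std::fabs(best - f_star);
}
```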
I've got the results with 5000 runs. They didn't change much from the previous data.

Accuracy (plots omitted): I think it makes no sense to upload the plots for this, since each function has its own scale and they are not really informative, but if someone wants them, I can put them up. As you can see, we have more or less the same accuracy as them, except for the branin function (where they are 2 orders of magnitude better), hartman6 (1 order better) and goldenprice (2 orders better), and the simple sphere, where we are 1 order of magnitude better.

Speed (plots omitted): Some good news for us: we are always faster! The plots are clear, we really beat them in this aspect, with them being, on average over all the functions, 1.61 times slower than us. I'm pretty sure the difference would be even bigger if I set the recalculation of the whole Cholesky matrix to happen on each iteration.
Maybe after some bug fixes (like random_init) and the new cmaes library, we should re-run these experiments.
We should be able to easily re-run these experiments so that we can check that there is no regression. Jean-Baptiste Mouret
Good idea!
This is solved by #96 ... Closing...