Compare the results of limbo with BayesOPT #13

Closed
jbmouret opened this issue Oct 16, 2015 · 30 comments

@jbmouret
Member

https://bitbucket.org/rmcantin/

Criteria:

  • accuracy (do we find the global optimum?)
  • speed (in how much time?)
@fedeallocati
Member

The first thing I tried is the simplest scenario possible, optimizing the Branin function:

  • Both use a plain GP, with no hyperparameter optimization
  • Kernel: Matern 5/2, with l = 1 and sigma = 1
  • Constant mean = 1
  • Initialization with 10 uniform random samples
  • Acquisition: UCB (LCB for BayesOpt, since it minimizes) with alpha = 0.125
  • Acquisition optimization: NLopt's DIRECT algorithm with 500 iterations in both cases
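
For readers unfamiliar with the terms in the list above, here is a minimal sketch of a Matern 5/2 kernel and of the UCB/LCB acquisition values. It is not taken from limbo or BayesOpt: the function names are hypothetical, and treating sigma as a standard deviation and alpha as a multiplier on the predictive standard deviation are assumptions (with sigma = 1 the first choice makes no difference).

```python
import math


def matern52(x, y, length_scale=1.0, sigma=1.0):
    """Matern 5/2 covariance between two points (sequences of floats).

    Assumption: sigma is a standard deviation, so the prefactor is sigma**2.
    """
    r = math.dist(x, y) / length_scale
    s = math.sqrt(5.0) * r
    # s * s / 3 equals 5 r^2 / 3, the quadratic term of the Matern 5/2 form.
    return sigma ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


def ucb(mean, variance, alpha=0.125):
    """Upper confidence bound, used when the objective is maximized."""
    return mean + alpha * math.sqrt(max(variance, 0.0))


def lcb(mean, variance, alpha=0.125):
    """Lower confidence bound, the mirror image used when minimizing."""
    return mean - alpha * math.sqrt(max(variance, 0.0))
```

With l = 1 and sigma = 1, the kernel equals 1 at zero distance and decays monotonically as the points move apart.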

To be fair, I had to remove some tricks that BayesOpt performs. For example, when optimizing the acquisition, it first performs a global search with DIRECT and then a shorter local search with BOBYQA. After this optimization, it also applies some random variations to the current minimum sample and runs a small local search around it. Another trick: if the squared difference between two consecutive samples stays below the noise for N iterations (N is configured by the user), it draws a totally random sample. There may be more tricks that I haven't seen, but I think these are the main ones.
I compiled both binaries with the same GCC, and without MKL for limbo. I'm performing 50000 runs of each on the cluster, so it will take some time. I will comment later with preliminary results.

@costashatz
Member

Really curious to see what the results will be.

@Aneoshun
Member

Hi,

I am curious about the results too.
And I am quite surprised to see that BayesOpt uses so many tricks.

However, be careful: your benchmark should also consider the “vanilla” version of the library.
While it’s definitely interesting to see whether both libraries find the same results without all these tricks, and whether our implementation is more efficient than BayesOpt, we also need to know which library finds the best results and which is the most efficient (in speed) in its standard form.
I agree that you should set all the “user parameters” to the same values, but you should also keep the tricks provided by BayesOpt.
These tricks may improve the accuracy of the optimization, but they will also penalize its speed.

Typically, a normal user of the libraries will not go into the source code and remove these tricks (as you did), so the performance we want to compare is what this user is likely to observe.

Do you see what I mean?

Antoine CULLY
Research associate in Robotics and Artificial Intelligence
Imperial College London
Personal Robotics Lab
http://pages.isir.upmc.fr/~cully/website/


@fedeallocati
Member

Hi Antoine,
I totally agree with you, but I first wanted a "base" comparison of both libraries under equal conditions. The next steps will be to start adding these tricks back and enabling different functionalities.

@Aneoshun
Member

You are completely right. I look forward to seeing the results!

Cheers,


@fedeallocati
Member

Preliminary results with ~2100 runs

Accuracy

Branin has 3 global optima, and for numerical reasons they give similar but not identical values:
(0.1238938, 0.81833299999999998) -> 0.39788735776433271
(0.54277280000000006, 0.151667) -> 0.39788735775195327
(0.96165199999999995, 0.16500000000000001) -> 0.39788735775266204

To compensate for this, I output the absolute difference between the best observation and the value at the global optimum closest to it.
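
The accuracy measure described above can be sketched as follows. This is an illustration, not the benchmark code used in the thread: it assumes the standard Branin function rescaled to the unit square via x1 = 15·u1 − 5 and x2 = 15·u2 (a mapping under which the three listed points land on the known Branin minimizers), and the names `branin_unit`, `OPTIMA` and `accuracy` are hypothetical.

```python
import math


def branin_unit(u1, u2):
    """Standard Branin function evaluated on unit-square coordinates.

    Assumption: the unit square maps to x1 in [-5, 10], x2 in [0, 15].
    """
    x1 = 15.0 * u1 - 5.0
    x2 = 15.0 * u2
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r, s, t = 6.0, 10.0, 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


# The three optima quoted in the comment, in unit-square coordinates.
OPTIMA = [(0.1238938, 0.818333), (0.5427728, 0.151667), (0.961652, 0.165)]


def accuracy(best_point, best_value):
    """Absolute gap between the best observation and the value at the
    global optimum closest to where that observation was made."""
    nearest = min(OPTIMA, key=lambda opt: math.dist(opt, best_point))
    return abs(best_value - branin_unit(*nearest))
```

If a library maximizes the negated function instead, the same measure applies after flipping the sign of the observed value.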

| | limbo | bayesopt |
|---|---|---|
| Samples | 2164.000 | 2158.000 |
| Upper whisker | 7.699362e-03 | 7.222849e-05 |
| 3rd quartile | 3.107246e-03 | 2.920610e-05 |
| Median | 2.377636e-04 | 2.699465e-06 |
| 1st quartile | 4.583516e-05 | 5.245012e-07 |
| Lower whisker | 1.145036e-07 | 3.402310e-10 |

p value: 4.109789e-229
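
A note on reading the whisker rows: the limbo upper whisker in the accuracy table above is numerically consistent with the usual box-plot fence Q3 + 1.5 × IQR computed from the quartile rows. That the statistics were produced this way is an inference from the numbers, not something stated in the thread, and the helper name below is hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) box-plot fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles of the limbo accuracy column, copied from the table.
lower, upper = tukey_fences(4.583516e-05, 3.107246e-03)
```

Here the lower fence is negative, which would explain why the reported lower whisker is a small positive number instead (presumably the smallest observed value).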

Here is a picture zoomed in on the important part:
[image: branin_simple_accuracy]

Note that both libraries occasionally produce outliers, some of which go up to a value near 11. You can see them here:
[image: branin_simple_accuracy_outliers]

We can see that BayesOpt is more accurate here: its median error is roughly two orders of magnitude smaller (medians 0.0002377636 for limbo vs 0.000002699465 for BayesOpt).

Speed

I measured the runtime, including the time to instantiate the classes and to run the optimization. In this case, limbo clearly wins (but let's not get excited, I think it's because in this scenario the kernel optimization is turned off).

| | limbo | bayesopt |
|---|---|---|
| Count | 2164.000 | 2158.000 |
| Upper whisker | 1.329500e+03 | 3.509500e+03 |
| 3rd quartile | 1.268000e+03 | 3.214000e+03 |
| Median | 1.239000e+03 | 3.129000e+03 |
| 1st quartile | 1.227000e+03 | 3.017000e+03 |
| Lower whisker | 1.165500e+03 | 2.721500e+03 |

Weirdly, I get a p-value of 0 (maybe just for numerical reasons).

[image: branin_simple_speed]

In this setting we run about 2.5 times faster than them (medians 1239 vs 3129). The two main reasons could be that they use Boost uBLAS while we use Eigen, and that they have a very complex object-oriented architecture with lots of inheritance, while we have optimized static classes.

After the weekend I will upload the full run.

@Aneoshun
Member

Thank you for these preliminary results.

Do you have an idea about why BayesOpt performs better than Limbo in terms of accuracy?
Is it because of the inner-optimization process? Or because of numerical reasons?


@fedeallocati
Member

Honestly, I have no idea. I'm using the exact same inner-optimization process for both (DIRECT algorithm from NLOpt library, I used the nlopt branch of limbo, with some modifications to match BayesOpt).

@Aneoshun
Member

Hmm, interesting.

Apart from the random initialization of the GP, which parts of the algorithm(s) are stochastic?
DIRECT is deterministic, isn’t it?

I am asking because if all the algorithms are deterministic, we can compare them step by step and see where they behave differently.
So, if they are both deterministic (except for the initialization), I suggest changing the initialization procedure to something deterministic and common to both algorithms (e.g., you can provide the 10 samples manually) and then comparing, step by step, which sample each selects and which value it receives. Theoretically, both algorithms should produce identical outputs. If they do not, you can infer the source of the difference. If the sample selection differs, the cause may be the kernel inversion (which would need more investigation afterwards), and if the returned values differ, the cause is probably numerical.

Do the two algorithms use different inversion procedures? (If our inversion is less “exact”, that may also explain why limbo is faster.)
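
The step-by-step comparison suggested above could look like the sketch below. It assumes each library can log its run as a list of (sample, value) pairs; that logging format, the tolerance, and the function name are all hypothetical.

```python
def first_divergence(trace_a, trace_b, tol=1e-9):
    """Find the first step at which two optimization traces differ.

    Each trace is a list of (sample, value) pairs, where sample is a
    sequence of floats. Returns (step, kind), with kind being "sample"
    or "value", or None if the common prefix of the traces agrees.
    """
    for step, ((xa, ya), (xb, yb)) in enumerate(zip(trace_a, trace_b)):
        if any(abs(p - q) > tol for p, q in zip(xa, xb)):
            return step, "sample"  # the libraries chose different points
        if abs(ya - yb) > tol:
            return step, "value"  # same point, different returned value
    return None
```

A "sample" result would point toward the model or the acquisition optimization, while a "value" result would point toward the evaluation of the function itself.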


@fedeallocati
Member

I think that DIRECT is deterministic, and the only source of stochasticity is the initialization. I will try providing the samples next week. With limbo it is super easy, but for BayesOpt I have to dig deeper into the code to modify it.
Also, we use Eigen's Cholesky decomposition and recompute it fully at each step, while they only add the new row to it. I guess they use some uBLAS functionality for this, but I haven't looked deeply enough into the library to see exactly how they do it, just some high-level calls. Anyway, what you say makes sense, and this could be the main point of difference.

@Aneoshun
Member

When there is no hyper-parameter optimization, the kernel matrix of the next iteration contains that of the current one (it only adds one row and one column).
In such circumstances, you can use the blockwise inversion theorem (https://en.wikipedia.org/wiki/Invertible_matrix#Blockwise_inversion).
It’s typically what I used for the blacklisting of samples.
I don’t know if it’s exactly what they do (mainly because it does not work with likelihood optimization), but this may lead to a better kernel inversion (it always relies on the previous inversion and it is exact, unlike most techniques that invert large matrices from scratch).
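
As an illustration of the blockwise inversion idea, here is a minimal pure-Python sketch that extends the inverse of a symmetric matrix K when one row and one column are appended. It is not code from limbo or BayesOpt; the function name and the list-of-lists representation are assumptions made for the example.

```python
def extend_inverse(k_inv, b, c):
    """Inverse of [[K, b], [b^T, c]] given K^-1 (K symmetric).

    k_inv: n x n inverse of the current kernel matrix (list of lists)
    b:     covariances between the new sample and the n old ones
    c:     variance of the new sample (the new diagonal entry)
    """
    n = len(k_inv)
    # u = K^-1 b; by symmetry the same vector serves as b^T K^-1.
    u = [sum(k_inv[i][j] * b[j] for j in range(n)) for i in range(n)]
    # Schur complement of K in the extended matrix.
    s = c - sum(b[i] * u[i] for i in range(n))
    if s <= 0.0:
        raise ValueError("extended matrix is not positive definite")
    extended = [
        [k_inv[i][j] + u[i] * u[j] / s for j in range(n)] + [-u[i] / s]
        for i in range(n)
    ]
    extended.append([-u[j] / s for j in range(n)] + [1.0 / s])
    return extended
```

Each update costs O(n²) instead of the O(n³) of a fresh inversion, which is why it only pays off while the hyper-parameters stay fixed.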

Enjoy your weekend!


@fedeallocati
Member

Yep, we know about it and discussed it briefly, but to do that we would have to change the API of the GP to allow incremental updates of the model. As you said, when the hyperparameters are optimized, everything gets recomputed, so we think it's not really worth it.

They perform incremental updates of the Cholesky decomposition, and every N iterations (configurable by the user) they optimize the hyperparameters and recompute the full decomposition. In this case, since there is no hyperparameter optimization, they always perform the incremental updates, which, as you said, may lead to greater stability.
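
To make the difference between the two strategies concrete, here is a small pure-Python sketch of a full Cholesky factorization next to an incremental update that appends one row. It illustrates the general technique only; it does not reproduce how BayesOpt or Eigen implement it, and the function names are hypothetical.

```python
import math


def cholesky(a):
    """Full lower-triangular Cholesky factor of a symmetric PD matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(acc) if i == j else acc / l[j][j]
    return l


def cholesky_append(l, b, c):
    """Factor of [[K, b], [b^T, c]] given the factor l of K.

    Only the new bottom row is computed: forward substitution solves
    l w = b, and the new diagonal entry is sqrt(c - w.w).
    """
    n = len(l)
    row = []
    for j in range(n):
        acc = b[j] - sum(row[k] * l[j][k] for k in range(j))
        row.append(acc / l[j][j])
    d = c - sum(v * v for v in row)
    if d <= 0.0:
        raise ValueError("extended matrix is not positive definite")
    return [r + [0.0] for r in l] + [row + [math.sqrt(d)]]
```

In exact arithmetic both routes give the same factor; the incremental one simply reuses the rows that cannot have changed.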

Have a good weekend!

@fedeallocati
Member

Hello! For some reason, not all jobs ran in the cluster, and I ended up with only 17743 samples, but I think they are more than enough. They just confirm what I posted previously:

Accuracy

| | limbo | bayesopt |
|---|---|---|
| Count | 17743.000 | 17743.000 |
| Upper whisker | 9.538656e-03 | 7.846090e-05 |
| 3rd quartile | 3.846090e-03 | 3.171025e-05 |
| Median | 2.611418e-04 | 2.584184e-06 |
| 1st quartile | 5.104675e-05 | 5.431500e-07 |
| Lower whisker | 1.599025e-08 | 2.161293e-11 |

[image: branin_simple_accuracy]

Speed

| | limbo | bayesopt |
|---|---|---|
| Count | 17743.000 | 17743.000 |
| Upper whisker | 1.371000e+03 | 3.568000e+03 |
| 3rd quartile | 1.287000e+03 | 3.244000e+03 |
| Median | 1.245000e+03 | 3.150000e+03 |
| 1st quartile | 1.231000e+03 | 3.028000e+03 |
| Lower whisker | 1.147000e+03 | 2.704000e+03 |

[image: branin_simple_speed]

I will continue benchmarking over the next few days.

@jbmouret
Member Author

Thank you. This is very interesting. What is your stopping criterion?


@fedeallocati
Member

I forgot to mention it: 190 iterations in both cases.

@Aneoshun
Member

Hi all,

Do you have any further explanation for the accuracy difference?

Best,


@fedeallocati
Member

I have been working on other things, and will resume this tomorrow.

@fedeallocati
Member

I just tried both libraries in the same scenario, but both starting with the same set of points (taken from here: http://mathematica.stackexchange.com/questions/47638/distribution-of-10-points-within-a-unit-square), and they give really different results. Limbo finds a suitable solution (with approximately the same accuracy as before), while BayesOpt does not and gets stuck in a local optimum.
Since I removed all randomness, it doesn't make sense to run them multiple times (I checked just in case, running each a few times).
It is clear that we have some difference somewhere: the recomputation of the Cholesky decomposition at each step, the Cholesky computation itself (we don't know how Eigen does it, or whether it matches what BayesOpt does; in theory they should be the same, but one method may be more precise than the other), or numerical differences introduced by Eigen vs Boost uBLAS. Do you think it's worth the effort to dig deeper into this?

Also, from now on there are many things to try and paths to follow, so I propose that we establish a "roadmap" for the comparisons:

  1. Add all the benchmark functions from bo_functions, to get a wider picture
  2. Add another variable to the analysis: the speed in terms of iterations needed to find the minimum (with an appropriate tolerance value). As I discussed with JB, this is usually more important than the computational speed
  3. Recompute the previous analysis with this new variable
  4. Restore BayesOpt tricks and compare again?
  5. ?
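
The iteration-based speed measure proposed in item 2 of the roadmap above could be computed along these lines. This is only a sketch under assumptions: the objective is minimized, the true optimum value is known, and the function name and default tolerance are hypothetical.

```python
def iterations_to_tolerance(observations, optimum, tol=1e-3):
    """Number of evaluations needed before the best value seen so far
    is within tol of the known optimum, or None if it never gets there.

    observations: objective values in the order they were evaluated.
    """
    best = float("inf")
    for count, value in enumerate(observations, start=1):
        best = min(best, value)
        if abs(best - optimum) <= tol:
            return count
    return None
```

Runs that never reach the tolerance return None and would need to be reported separately, for example as a failure rate.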

I'm waiting for your suggestions on which scenarios are worth comparing. Should these scenarios always be fair? (For example, by default they don't recompute the hyperparameters at every step, but every 50, I think; should we set this parameter to 1, or leave the default?) Please write any ideas that you have!

@costashatz
Member

> Add all the benchmark functions from bo_functions, to get a wider picture
> Add another variable to the analysis: the speed in terms of iterations needed to find the minimum (with an appropriate tolerance value). As I discussed with JB, this is usually more important than the computational speed
> Recompute the previous analysis with this new variable

I agree with this.

> Restore BayesOpt tricks and compare again

I think this is also a good idea.

> Should these scenarios always be fair? (For example, by default they don't recompute the hyperparameters at every step, but every 50, I think; should we set this parameter to 1, or leave the default?)

I think we need to test the scenarios once with fair comparison and once with the defaults of each library.

@fedeallocati
Member

Okay people, I've got some news that explains the difference in accuracy and speed.
It was my fault: I didn't notice that when performing the inner optimization, BayesOpt multiplies the number of iterations you set by the dimension of the function, so in the last case they were running twice as many iterations as us.
I'm running again with 8 different functions to optimize, this time taking the dimension of the functions into account, and will upload the results (just 5000 runs of each; it would take too much time otherwise).
Some preliminary runs show that we are now in roughly the same order of magnitude in accuracy (on some functions we win by a little, on others they do, but the difference is really small), and that we are still faster (although not by as much as before: less than 2 times faster than them).
From a theoretical point of view, it doesn't make sense that they are slower, so I think the difference now comes purely from Eigen vs Boost uBLAS.
I will try to upload the results over the weekend.
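
The budget mismatch described above comes down to simple arithmetic, sketched here. The helper names and the `scales_with_dim` flag are hypothetical and are not options of either library; they only model the behavior reported in this comment.

```python
def effective_inner_iterations(configured, dim, scales_with_dim):
    """Inner-optimizer iterations actually run for a configured value.

    scales_with_dim=True models a library that multiplies the configured
    value by the problem dimension; False models one that uses it as is.
    """
    if configured <= 0 or dim <= 0:
        raise ValueError("configured and dim must be positive")
    return configured * dim if scales_with_dim else configured


def setting_for_budget(target, dim, scales_with_dim):
    """Configured value that yields at least `target` effective iterations."""
    if scales_with_dim:
        return -(-target // dim)  # ceiling division
    return target
```

For the 2-dimensional Branin case with 500 configured iterations, this gives 1000 effective iterations on one side and 500 on the other, which matches the factor of two mentioned above.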

@jbmouret
Member Author

Thank you. I am looking forward to seeing the results!
-- JBM


@jbmouret
Member Author

I agree with Konstantinos on all of this.

-- JBM

@fedeallocati
Member

Just so you can check, results with 1000 runs of each:

Accuracy

| | limbo sphere | bayesopt sphere | limbo ellipsoid | bayesopt ellipsoid | limbo branin | bayesopt branin |
|---|---|---|---|---|---|---|
| Count | 1000.000 | 1000.000 | 1000.000 | 1000.000 | 1000.000 | 1000.000 |
| Upper whisker | 1.563614e-08 | 1.145400e-07 | 1.763106e-02 | 1.750175e-02 | 8.314348e-03 | 5.072667e-05 |
| 3rd quartile | 6.624448e-09 | 4.887992e-08 | 7.128387e-03 | 7.003101e-03 | 3.350847e-03 | 2.052798e-05 |
| Median | 2.752702e-09 | 1.854793e-08 | 7.612957e-04 | 1.321474e-04 | 2.352545e-04 | 2.037751e-06 |
| 1st quartile | 6.166560e-10 | 5.106548e-09 | 1.266081e-04 | 4.004099e-06 | 4.184631e-05 | 3.955210e-07 |
| Lower whisker | 9.710000e-15 | 8.630000e-15 | 2.897569e-08 | 4.341416e-11 | 5.723248e-08 | 4.764828e-10 |

| | limbo hartman3 | bayesopt hartman3 | limbo hartman6 | bayesopt hartman6 | limbo rastrigin | bayesopt rastrigin |
|---|---|---|---|---|---|---|
| Count | 1000.000 | 1000.000 | 1000.000 | 1000.000 | 1000.000 | 1000.000 |
| Upper whisker | 1.367687e-03 | 8.651926e-05 | 3.424624e-01 | 2.984132e-01 | 3.500000e+00 | 3.500000e+00 |
| 3rd quartile | 5.772985e-04 | 8.228316e-05 | 1.370208e-01 | 1.193896e-01 | 2.000000e+00 | 2.000000e+00 |
| Median | 7.675129e-05 | 8.204481e-05 | 2.327436e-03 | 9.921574e-04 | 1.999996e+00 | 2.000000e+00 |
| 1st quartile | 5.037282e-05 | 7.945910e-05 | 5.969094e-05 | 4.057478e-05 | 9.999998e-01 | 1.000000e+00 |
| Lower whisker | 6.459865e-07 | 7.522300e-05 | 2.949481e-08 | 1.299830e-07 | 0.000000e+00 | 0.000000e+00 |

| | limbo goldenprice | bayesopt goldenprice | limbo sixhumpcamel | bayesopt sixhumpcamel |
|---|---|---|---|---|
| Count | 1000.000 | 1000.000 | 1000.000 | 1000.000 |
| Upper whisker | 3.985235e-02 | 5.634791e+00 | 3.162881e-02 | 3.162843e-02 |
| 3rd quartile | 1.632651e-02 | 2.302588e+00 | 3.162859e-02 | 3.162842e-02 |
| Median | 2.411001e-03 | 3.928288e-05 | 3.162850e-02 | 3.162842e-02 |
| 1st quartile | 6.426208e-04 | 2.224865e-06 | 3.162845e-02 | 3.162842e-02 |
| Lower whisker | 4.258365e-07 | 2.183231e-11 | 3.162843e-02 | 3.162842e-02 |

Speed

| | limbo sphere | bayesopt sphere | limbo ellipsoid | bayesopt ellipsoid | limbo branin | bayesopt branin |
|---|---|---|---|---|---|---|
| Count | 5000.000 | 1156.000 | 5000.000 | 1150.000 | 5000.000 | 1183.000 |
| Upper whisker | 2.421500e+03 | 3.532250e+03 | 2.682000e+03 | 3.800875e+03 | 2.718000e+03 | 4.144000e+03 |
| 3rd quartile | 2.003000e+03 | 3.256250e+03 | 2.019000e+03 | 3.334000e+03 | 1.995000e+03 | 3.367000e+03 |
| Median | 1.959000e+03 | 3.197000e+03 | 1.964000e+03 | 3.249000e+03 | 1.939000e+03 | 3.284000e+03 |
| 1st quartile | 1.724000e+03 | 3.072250e+03 | 1.577000e+03 | 3.022750e+03 | 1.513000e+03 | 2.838000e+03 |
| Lower whisker | 1.305500e+03 | 2.796250e+03 | 1.189000e+03 | 2.555875e+03 | 1.153000e+03 | 2.044500e+03 |

| | limbo hartman3 | bayesopt hartman3 | limbo hartman6 | bayesopt hartman6 | limbo rastrigin | bayesopt rastrigin |
|---|---|---|---|---|---|---|
| Count | 5000.000 | 1165.000 | 5000.000 | 1176.000 | 5000.000 | 1159.000 |
| Upper whisker | 3.659375e+03 | 5.070500e+03 | 7.302000e+03 | 1.097400e+04 | 4.767000e+03 | 5.867000e+03 |
| 3rd quartile | 2.915000e+03 | 4.754000e+03 | 5.747250e+03 | 9.321250e+03 | 3.790000e+03 | 5.570000e+03 |
| Median | 2.842000e+03 | 4.653000e+03 | 5.620000e+03 | 9.114000e+03 | 3.702000e+03 | 5.477000e+03 |
| 1st quartile | 2.418750e+03 | 4.543000e+03 | 4.579250e+03 | 8.109500e+03 | 3.108750e+03 | 5.372000e+03 |
| Lower whisker | 1.674375e+03 | 4.226500e+03 | 3.370000e+03 | 6.291875e+03 | 2.253000e+03 | 5.075000e+03 |

| | limbo goldenprice | bayesopt goldenprice | limbo sixhumpcamel | bayesopt sixhumpcamel |
|---|---|---|---|---|
| Count | 5000.000 | 1148.000 | 5000.000 | 1148.000 |
| Upper whisker | 2.656000e+03 | 4.071000e+03 | 2.576500e+03 | 3.724750e+03 |
| 3rd quartile | 2.006000e+03 | 3.332000e+03 | 1.981000e+03 | 3.049000e+03 |
| Median | 1.956000e+03 | 3.251500e+03 | 1.926500e+03 | 2.990000e+03 |
| 1st quartile | 1.551000e+03 | 2.787750e+03 | 1.584000e+03 | 2.598500e+03 |
| Lower whisker | 1.194000e+03 | 1.971375e+03 | 1.162000e+03 | 1.922750e+03 |

@jbmouret
Member Author

Thank you.

How do you measure accuracy? Is it the difference from the real optimum? If so, it seems that both libraries fail on Rastrigin (which is not unexpected).

Does a count of 1000 mean 1000 iterations? If so, this is a lot.

@fedeallocati
Member

Exactly, accuracy is the difference with the real optimum. The count is the
number of runs of the executables.

@fedeallocati
Member

I've got the results with 5000 runs. They didn't change much from the previous data:

Accuracy

| | limbo sphere | bayesopt sphere | limbo ellipsoid | bayesopt ellipsoid | limbo branin | bayesopt branin |
|---|---|---|---|---|---|---|
| 3rd quartile | 6.532603e-09 | 5.217534e-08 | 7.719718e-03 | 7.002796e-03 | 4.920966e-03 | 2.790338e-05 |
| Median | 2.327818e-09 | 1.813649e-08 | 8.095894e-04 | 1.387372e-04 | 3.051706e-04 | 2.461750e-06 |
| 1st quartile | 5.443109e-10 | 4.554303e-09 | 1.277711e-04 | 3.032496e-06 | 5.276195e-05 | 5.423679e-07 |

| | limbo hartman3 | bayesopt hartman3 | limbo hartman6 | bayesopt hartman6 | limbo rastrigin | bayesopt rastrigin |
|---|---|---|---|---|---|---|
| 3rd quartile | 6.067495e-04 | 8.228137e-05 | 1.266144e-01 | 1.193084e-01 | 2.994957e+00 | 2.991187e+00 |
| Median | 7.840647e-05 | 8.209100e-05 | 2.426417e-03 | 5.560084e-04 | 1.999999e+00 | 2.000000e+00 |
| 1st quartile | 5.589862e-05 | 8.047203e-05 | 5.868768e-05 | 2.315586e-05 | 9.999998e-01 | 1.000000e+00 |

| | limbo goldenprice | bayesopt goldenprice | limbo sixhumpcamel | bayesopt sixhumpcamel |
|---|---|---|---|---|
| 3rd quartile | 1.291762e-02 | 2.302586e+00 | 3.162859e-02 | 3.162842e-02 |
| Median | 2.074041e-03 | 2.584832e-05 | 3.162850e-02 | 3.162842e-02 |
| 1st quartile | 5.411675e-04 | 1.780868e-06 | 3.162845e-02 | 3.162842e-02 |

I think it makes no sense to upload the plots for this, since each function has its own scale and they are not really informative, but if someone wants them, I can post them.

As you can see, we have more or less the same accuracy as them, except for the branin function (where they are 2 orders of magnitude better), hartman6 (1 order better) and goldenprice (2 orders better), and the simple sphere, where we are 1 order of magnitude better.
If you want to check whether these differences are produced by the stability of their Cholesky decomposition, tell me, and I can rerun their algorithm while recomputing the full Cholesky each time.

Speed

| | limbo sphere | bayesopt sphere | limbo ellipsoid | bayesopt ellipsoid | limbo branin | bayesopt branin |
|---|---|---|---|---|---|---|
| Median | 1.959000e+03 | 3.188000e+03 | 1.964000e+03 | 3.246000e+03 | 1.939000e+03 | 3.277000e+03 |

| | limbo hartman3 | bayesopt hartman3 | limbo hartman6 | bayesopt hartman6 | limbo rastrigin | bayesopt rastrigin |
|---|---|---|---|---|---|---|
| Median | 2.842000e+03 | 4.639000e+03 | 5.620000e+03 | 9.082000e+03 | 3.702000e+03 | 5.456000e+03 |

| | limbo goldenprice | bayesopt goldenprice | limbo sixhumpcamel | bayesopt sixhumpcamel |
|---|---|---|---|---|
| Median | 1.956000e+03 | 3.256000e+03 | 1.926500e+03 | 2.994000e+03 |

[image: simple_speed_1]

[image: simple_speed_2]

[image: simple_speed_3]

Some good news for us: we are always faster! The plots are clear, we really beat them in this respect, with them being on average 1.61 times slower than us over all the functions. I'm pretty sure the difference will be even bigger if I make them recompute the whole Cholesky matrix at each iteration.
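
One way to arrive at a figure close to 1.61 is to average the per-function ratios of the median runtimes listed in the speed table above. Whether the number in the comment was computed exactly this way is an assumption; the dictionary and function names are hypothetical.

```python
# Median runtimes (limbo, bayesopt) copied from the 5000-run speed table.
MEDIANS = {
    "sphere": (1959.0, 3188.0),
    "ellipsoid": (1964.0, 3246.0),
    "branin": (1939.0, 3277.0),
    "hartman3": (2842.0, 4639.0),
    "hartman6": (5620.0, 9082.0),
    "rastrigin": (3702.0, 5456.0),
    "goldenprice": (1956.0, 3256.0),
    "sixhumpcamel": (1926.5, 2994.0),
}


def slowdown_ratios(medians):
    """Per-function ratio of the bayesopt median to the limbo median."""
    return {name: bayesopt / limbo for name, (limbo, bayesopt) in medians.items()}


def mean_slowdown(medians):
    """Arithmetic mean of the per-function ratios."""
    ratios = slowdown_ratios(medians)
    return sum(ratios.values()) / len(ratios)
```

Every ratio is above 1, with rastrigin showing the smallest gap and branin the largest.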

@jbmouret assigned jbmouret and fedeallocati and unassigned jbmouret on Dec 12, 2015
@costashatz
Member

Maybe after some bug fixes (like random_init) and the new cmaes library, we should re-run these experiments.

@jbmouret
Member Author

We should be able to easily run these experiments so that we can check that there is no regression.

-- JBM

@costashatz
Member

> We should be able to easily run these experiments so that we can check that there is no regression.

Good idea!

@costashatz
Member

This is solved by #96 ... Closing...
