
PESM acquisition function takes many times longer than PES #103

Closed

martinsimon88 opened this issue Apr 23, 2017 · 19 comments


martinsimon88 commented Apr 23, 2017

Hello,

I would like to check whether it is expected that PESM can take many hours to produce 50 suggestions, while PES does it in ca 10 minutes. Or could one of the dependencies be causing it? I had some trouble setting up PyGMO and NLopt on Ubuntu 16.04 and downgraded my system to Ubuntu 14.04, on which everything seemed to compile fine.

Thanks!
Martin

@danielhernandezlobato (Collaborator)

Hello,

That is strange. Are you trying the toy problem located at Spearmint/examples/moo/? If not, what is the dimensionality of your problem?


martinsimon88 commented Apr 24, 2017 via email

@danielhernandezlobato (Collaborator)

On my laptop it takes 10 seconds to do an iteration, so 50 should not take longer than 10 minutes. Can you check what the bottleneck is in your case? The computations done are:

Step 1 - Run EP 10 times until convergence (once per sample of the hyper-parameters).
Step 2 - Optimize the acquisition function.

Ideally you should see 10 runs of EP until convergence on the screen; each printed line is an EP iteration. After running EP, the acquisition function is optimized. I do not have NLopt installed, so in my case scipy is used instead.

@martinsimon88 (Author)

Hello! I uninstalled NLopt, but it still runs at ca 1 min 30 s per iteration.
I guess most of the time is spent on Step 1. It prints the following ca 10 times:
Computing PESM for branin_1, branin_2
1: change=1.829144 damping: 0.500000
2: change=2.487438 damping: 0.495000
3: change=0.252376 damping: 0.490050
4: change=0.045401 damping: 0.485149
5: change=0.049822 damping: 0.480298
6: change=0.034507 damping: 0.475495
7: change=0.032269 damping: 0.470740
8: change=0.039005 damping: 0.466033
9: change=0.034097 damping: 0.461372
10: change=0.029572 damping: 0.456759
11: change=0.024536 damping: 0.452191
12: change=0.020833 damping: 0.447669
13: change=0.017341 damping: 0.443192
14: change=0.014655 damping: 0.438761
15: change=0.012268 damping: 0.434373
16: change=0.010359 damping: 0.430029
17: change=0.008712 damping: 0.425729
18: change=0.007362 damping: 0.421472
19: change=0.006215 damping: 0.417257
20: change=0.005261 damping: 0.413084
21: change=0.004455 damping: 0.408953
22: change=0.003779 damping: 0.404864
23: change=0.003209 damping: 0.400815
24: change=0.002729 damping: 0.396807
25: change=0.002323 damping: 0.392839
26: change=0.001980 damping: 0.388911
27: change=0.001690 damping: 0.385022
28: change=0.001444 damping: 0.381171
29: change=0.001235 damping: 0.377360
30: change=0.001058 damping: 0.373586
31: change=0.000907 damping: 0.369850
.......................................................................................................
.......................................................................................................

@danielhernandezlobato (Collaborator)

Ok, my machine has an Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz, which could be slightly faster. Are you running spearmint in a virtual machine? That may degrade performance. The only other thing I can think of is the library used for linear algebra operations. Since the bottleneck of EP is matrix operations, using a better linear algebra library may give you better performance. I have openblas installed. You may try to install openblas and then recompile numpy and scipy so that they use that library for matrix operations. Here is some info about how to do that:

https://leemendelowitz.github.io/blog/installing-numpy-with-openblas.html
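
After recompiling, you can verify that numpy is actually linked against OpenBLAS; np.__config__.show() prints the build configuration, and the BLAS/LAPACK entries should mention openblas:

import numpy as np
np.__config__.show()  # look for "openblas" in the reported libraries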


martinsimon88 commented Apr 25, 2017 via email

@martinsimon88 (Author)

OpenBLAS did not speed things up; it still takes ca 1-2 minutes per iteration. What else might it be?

Thanks,
Martin

@danielhernandezlobato (Collaborator)

Ok, I would suggest some tests to check where the bottleneck is.

Can you check how much time it takes EP to converge in the first run (normally there are 10 runs of EP at each iteration of spearmint)? You simply have to edit the file

spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py

go to line 357, to the definition of the "ep" function, and introduce these lines before the beginning of the while loop located at line 433:

import time
start = time.time()  # record the time just before EP's while loop starts

Then, at the end of the while loop (line 495), add these lines with the same indent as line 496:

end = time.time()  # time once EP has converged
import pdb; pdb.set_trace()  # drop into the debugger to inspect end - start
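
(As an aside, a reusable version of this timing pattern can be wrapped in a small context manager; the helper below is hypothetical and not part of the Spearmint code.)

import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # prints the wall-clock time spent inside the with-block
    start = time.time()
    yield
    print("%s took %.3f s" % (label, time.time() - start))

# hypothetical usage inside ep(), wrapping the while loop:
#     with timed("EP convergence"):
#         while not converged:
#             ...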

Run spearmint on the toy problem. It will stop at the breakpoint after a run of EP. There you can type

end - start

which will give you the time taken. In my case it gives this output:

dani@dani-HP-OMEN:/tmp/Spearmint-PESM/examples/moo$ python /tmp/Spearmint-PESM/spearmint/main.py .
Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/usr/local/lib/python2.7/dist-packages/IPython/kernel/init.py:13: ShimWarning: The IPython.kernel package has been deprecated. You should import from ipykernel or jupyter_client instead.
"You should import from ipykernel or jupyter_client instead.", ShimWarning)
Fitting GP to 1 data for branin_1 task...
Fitting GP to 1 data for branin_2 task...

Solving Multi-objective global optimization of posterior means!

Global Optimization finished. Size of the pareto set: 1 opt. 1 observed.

Getting suggestion for branin_1, branin_2...

Computing PESM for branin_1, branin_2
1: change=0.398697 damping: 0.500000
2: change=1.001192 damping: 0.495000
3: change=0.196313 damping: 0.490050
4: change=0.130776 damping: 0.485149
5: change=0.063683 damping: 0.480298
6: change=0.030000 damping: 0.475495
7: change=0.019682 damping: 0.470740
8: change=0.023561 damping: 0.466033
9: change=0.030155 damping: 0.461372
10: change=0.036471 damping: 0.456759
11: change=0.038597 damping: 0.452191
12: change=0.038628 damping: 0.447669
13: change=0.036680 damping: 0.443192
14: change=0.033220 damping: 0.438761
15: change=0.028888 damping: 0.434373
16: change=0.024297 damping: 0.430029
17: change=0.019908 damping: 0.425729
18: change=0.015989 damping: 0.421472
19: change=0.012651 damping: 0.417257
20: change=0.009901 damping: 0.413084
21: change=0.007688 damping: 0.408953
22: change=0.005938 damping: 0.404864
23: change=0.004805 damping: 0.400815
24: change=0.004033 damping: 0.396807
25: change=0.003335 damping: 0.392839
26: change=0.002146 damping: 0.388911
27: change=0.001885 damping: 0.385022
28: change=0.001622 damping: 0.381171
29: change=0.001022 damping: 0.377360
30: change=0.001374 damping: 0.373586
31: change=0.001717 damping: 0.369850
32: change=0.001883 damping: 0.366152
33: change=0.001576 damping: 0.362490
34: change=0.001265 damping: 0.358865
35: change=0.000974 damping: 0.355277

/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py(504)ep()
-> return a
(Pdb) end - start
0.25279808044433594
(Pdb)

So you can see that EP only takes 0.25 seconds to converge on my laptop. What do you get in your case?


martinsimon88 commented Apr 27, 2017 via email

@danielhernandezlobato (Collaborator)

Ok, it is slower, but not by too much. Can you now check the time taken to evaluate the acquisition function on the grid of points that is used to solve the optimization problem? You only need to do something similar. In particular, in the original file (that is, without the changes made before)

spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py

before line 2967, add the code

import time
start = time.time()  # just before the acquisition is evaluated on the grid

and then, after line 2968, add the code:

end = time.time()
if cand.shape[ 0 ] > 1:
    import pdb; pdb.set_trace()  # stop only when evaluating the full grid

That will stop spearmint after the acquisition has been evaluated on the grid of points and will measure the time taken. In particular, I get this result:

dani@dani-HP-OMEN:/tmp/Spearmint-PESM/examples/moo$ python /tmp/Spearmint-PESM/spearmint/main.py .
Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
/usr/local/lib/python2.7/dist-packages/IPython/kernel/init.py:13: ShimWarning: The IPython.kernel package has been deprecated. You should import from ipykernel or jupyter_client instead.
"You should import from ipykernel or jupyter_client instead.", ShimWarning)

Getting suggestion for branin_1, branin_2...

Suggestion:
NAME TYPE VALUE
---- ---- -----
X float 3.803711
Y float 12.583008
Submitted job 1 for tasks(s) branin_1, branin_2 with local scheduler (process id: 4912).
Current time: 2017-04-27 14:16:22
Status: 1 pending, 0 complete.
ID(s) of pending job(s) for branin_1, branin_2: 1
Waiting for results...

Fitting GP to 1 data for branin_1 task...
Fitting GP to 1 data for branin_2 task...

Solving Multi-objective global optimization of posterior means!

Global Optimization finished. Size of the pareto set: 1 opt. 1 observed.

Getting suggestion for branin_1, branin_2...

Computing PESM for branin_1, branin_2
1: change=0.978531 damping: 0.500000
2: change=1.038956 damping: 0.495000
3: change=0.609380 damping: 0.490050
4: change=0.224851 damping: 0.485149
5: change=0.969918 damping: 0.480298
6: change=0.233555 damping: 0.475495
7: change=0.089802 damping: 0.470740
8: change=0.061536 damping: 0.466033
9: change=0.040302 damping: 0.461372
10: change=0.035127 damping: 0.456759
11: change=0.029982 damping: 0.452191
12: change=0.027289 damping: 0.447669
13: change=0.025151 damping: 0.443192
14: change=0.023600 damping: 0.438761
15: change=0.021691 damping: 0.434373
16: change=0.019590 damping: 0.430029
17: change=0.017464 damping: 0.425729
18: change=0.015408 damping: 0.421472
19: change=0.013485 damping: 0.417257
20: change=0.011728 damping: 0.413084
21: change=0.010152 damping: 0.408953
22: change=0.008756 damping: 0.404864
23: change=0.007532 damping: 0.400815
24: change=0.006467 damping: 0.396807
25: change=0.005546 damping: 0.392839
26: change=0.004753 damping: 0.388911
27: change=0.004072 damping: 0.385022
28: change=0.003489 damping: 0.381171
29: change=0.002991 damping: 0.377360
30: change=0.002565 damping: 0.373586
31: change=0.002202 damping: 0.369850
32: change=0.001891 damping: 0.366152
33: change=0.001626 damping: 0.362490
34: change=0.001400 damping: 0.358865
35: change=0.001207 damping: 0.355277
36: change=0.001041 damping: 0.351724
37: change=0.000900 damping: 0.348207
.......................................................................................................
.......................................................................................................
1.48276294016

/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/acquisition_functions/predictive_entropy_search_multiobjective.py(2980)acquisition()
-> if tasks is None:
(Pdb) end - start
0.3585538864135742
(Pdb)

That is, it takes 0.35 seconds to evaluate the acquisition on the grid of points, which has a size of about 1,000 points.

@danielhernandezlobato
Copy link
Collaborator

By the way, note that there should be a tab before import pdb; pdb.set_trace(), so that the breakpoint only triggers inside the if.


martinsimon88 commented Apr 27, 2017 via email

@danielhernandezlobato (Collaborator)

Ok, can you check the time that it takes to optimize the acquisition function using scipy and L-BFGS? You only have to add the following lines to the file

spearmint/choosers/default_chooser.py

Before line 820, add

import time
start = time.time()  # just before the acquisition optimization starts

and after line 821, add

end = time.time()  # time after the optimizer has returned
import pdb; pdb.set_trace()

In my case I get

.
.
.
.

13: change=0.021149 damping: 0.443192
14: change=0.019065 damping: 0.438761
15: change=0.016963 damping: 0.434373
16: change=0.014914 damping: 0.430029
17: change=0.012975 damping: 0.425729
18: change=0.011187 damping: 0.421472
19: change=0.009573 damping: 0.417257
20: change=0.008141 damping: 0.413084
21: change=0.006889 damping: 0.408953
22: change=0.005808 damping: 0.404864
23: change=0.018193 damping: 0.400815
24: change=0.011807 damping: 0.396807
25: change=0.008542 damping: 0.392839
26: change=0.006537 damping: 0.388911
27: change=0.005136 damping: 0.385022
28: change=0.004086 damping: 0.381171
29: change=0.003271 damping: 0.377360
30: change=0.002628 damping: 0.373586
31: change=0.002117 damping: 0.369850
32: change=0.001710 damping: 0.366152
33: change=0.001383 damping: 0.362490
34: change=0.001122 damping: 0.358865
35: change=0.000912 damping: 0.355277
.......................................................................................................
.......................................................................................................
1.1683031283
Optimizing PESM with L-BFGS (numerically estimating gradients)

/usr/local/lib/python2.7/dist-packages/spearmint-0.1-py2.7-linux-x86_64.egg/spearmint/choosers/default_chooser.py(830)compute_acquisition_function()
-> y_opt = -y_opt
(Pdb) end - start
2.172369956970215
(Pdb)

That is, it takes 2.17 seconds to optimize the acquisition function to determine the next evaluation point.


martinsimon88 commented Apr 27, 2017 via email

@danielhernandezlobato (Collaborator)

Ok, for some reason it seems your computer is about two times slower than mine. That is strange, however, since your processor seems to be as good as mine. The only reason I can think of for that is the use of openblas. Maybe you installed openblas but did not recompile numpy and scipy properly, so that in the end openblas is not used in practice.

I think PESM can be slightly more expensive than PESC. For each new point at which the acquisition function has to be evaluated, PESC updates factors that only change the marginals of the GPs (which are independent), and hence no matrix inverses have to be computed. In the case of PESM, however, you have to update several factors that introduce correlations between the function values at the new point and the function values at the Pareto-optimal points (typically there are at most 50 of those). To obtain the new updated marginals after refining the factors, PESM must invert a matrix, which is expensive.

This may explain why PESM is slower than PESC. However, they should not be so different that one runs in minutes and the other in hours.
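
As a toy illustration of this cost difference (not Spearmint code; the update formulas below are placeholders), compare per-candidate element-wise marginal updates with per-candidate matrix inverses:

import time
import numpy as np

m = 50           # roughly the size of the Pareto set plus the candidate point
n_cand = 1000    # size of the candidate grid

rng = np.random.RandomState(0)
A = rng.randn(m, m)
cov = A.dot(A.T) + m * np.eye(m)   # a well-conditioned covariance matrix
var = np.diag(cov).copy()

start = time.time()
for _ in range(n_cand):            # PESC-like: only the marginals change
    var_new = var / (1.0 + var)    # cheap O(m) element-wise update
print("marginal updates: %.3f s" % (time.time() - start))

start = time.time()
for _ in range(n_cand):            # PESM-like: a full inverse per candidate
    cov_new = np.linalg.inv(np.linalg.inv(cov) + np.eye(m))  # O(m^3)
print("matrix inverses: %.3f s" % (time.time() - start))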


martinsimon88 commented Apr 27, 2017 via email

@danielhernandezlobato (Collaborator)

Ok. I think you will have to code that yourself. You can check how spearmint uses mongodb by looking, for example, at the file generate_hypervolumes.py in the example folder and at main.py.
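
For what it is worth, a minimal pymongo sketch for inspecting what spearmint stores might look like the following; the database and collection names are assumptions, so check main.py and generate_hypervolumes.py for the names your setup actually uses:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client['spearmint']              # default database name (assumption)
for name in db.collection_names():    # list this database's collections
    print(name)
for doc in db['branin.jobs'].find():  # hypothetical collection name
    print(doc)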

@danielhernandezlobato (Collaborator)

By the way, I think that by default openblas uses several threads to do the matrix computations. That is useful when working with huge matrices, but it hurts performance when the matrices are rather small, because of the overhead of creating and synchronizing threads. You can check whether using only a single thread improves the results. You only have to export these variables:

export OPENBLAS_NUM_THREADS=1   # limit OpenBLAS to a single thread
export GOTO_NUM_THREADS=1       # same, for the older GotoBLAS variable name
export OMP_NUM_THREADS=1        # limit OpenMP-based BLAS backends
export OPENBLAS_MAIN_FREE=1     # disable OpenBLAS thread affinity
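
If exporting in the shell is inconvenient, the same variables can be set from Python instead, as long as this happens before numpy is first imported (the BLAS library reads them at load time):

import os
os.environ['OPENBLAS_NUM_THREADS'] = '1'   # must run before importing numpy
os.environ['OMP_NUM_THREADS'] = '1'
import numpy as np                         # BLAS now runs single-threaded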


martinsimon88 commented Apr 28, 2017

Awesome! It is definitely faster after exporting the variables. My machine now takes ca 25 minutes to finish the /moo example case, which matches the timing differences we compared.

I have experienced something similar with computational physics solvers, where running a problem on one processor is faster. Thanks!

Also, thanks for the tip about generate_hypervolumes.py!

Edit: Do you think the single-processor setup will have the reverse effect when I increase the number of input variables, or when it runs for e.g. 200 iterations?
