
[MRG] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor #7997

Closed
wants to merge 24 commits

Conversation

amanp10
Contributor

@amanp10 amanp10 commented Dec 7, 2016

Fixes #7987

It adds the Embarrassingly Parallel helper in GaussianProcessRegressor class.

I have used 'check_pickle=False' for the 'delayed' function as it was not working otherwise. However, I am not sure about the logic behind this.

@jnothman
Member

jnothman commented Dec 7, 2016 via email

theta_initial = \
    self.rng.uniform(bounds[:, 0], bounds[:, 1])
optima.append(
    self._constrained_optimization(obj_func, theta_initial,
                                   bounds))
Parallel(n_jobs=self.n_jobs)(delayed(optima_iterations,
                                     check_pickle=False)()
Member

I would be really surprised if using an inner function like this was working even with check_pickle=False for n_jobs != 1... You will need to move optima_iterations to the module scope (or class-level maybe).

Drawing random numbers in parallel processes seems a bit dangerous too. And I don't think you will be able to match the result of n_jobs=1 with n_jobs != 1 this way. An alternative would be to draw all the random numbers first and then pass them to the function inside Parallel.

Optional: refactoring optima_iterations into a pure function that does not take self as argument is probably a good idea to reduce surprising behaviours.
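The module-scope requirement comes from how the then-default multiprocessing backend serializes callables with plain pickle, which refuses nested functions. A minimal demonstration (note that modern joblib's loky backend uses cloudpickle, which can serialize closures):

```python
import pickle


def outer():
    def inner(x):  # defined inside another function: a "local object"
        return x + 1
    return inner


# Plain pickle, which the multiprocessing backend relied on, refuses
# nested functions; this is why the helper had to move to module scope.
try:
    pickle.dumps(outer())
    picklable = True
except Exception:
    picklable = False
print("picklable:", picklable)
```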

Member

@lesteve lesteve Dec 7, 2016

To illustrate what I mean:

from joblib import Parallel, delayed

import numpy as np


def func(rng):
    return rng.randn(2)

rng = np.random.RandomState(0)

n_jobs=2
result = Parallel(n_jobs=n_jobs)(delayed(func)(rng) for x in [1, 2, 3])
print(result)

with n_jobs=1:

[array([ 1.76405235,  0.40015721]), array([ 0.97873798,  2.2408932 ]), array([ 1.86755799, -0.97727788])]

with n_jobs=2

[array([ 1.76405235,  0.40015721]), array([ 1.76405235,  0.40015721]), array([ 1.76405235,  0.40015721])]

What happens in the n_jobs=2 case is that each subprocess gets a copy of rng so the random numbers are the same for each iteration.
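The suggested fix, sketched here with a hypothetical func rather than the PR's code: draw every random number in the parent process and hand the workers plain arrays, so the result no longer depends on n_jobs.

```python
from joblib import Parallel, delayed

import numpy as np


def func(theta):
    # The worker only receives a concrete array: no RNG state is
    # shared or copied across processes, so results are reproducible.
    return theta * 2


rng = np.random.RandomState(0)
# Draw everything in the parent process, up front.
thetas = [rng.randn(2) for _ in range(3)]

result = Parallel(n_jobs=2)(delayed(func)(t) for t in thetas)
print(result)
```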

@amanp10 amanp10 changed the title [MRG] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor [WIP] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor Dec 7, 2016
@amanp10
Contributor Author

amanp10 commented Dec 7, 2016

I have made the updates suggested by @jnothman. Please have a look.
Also, as suggested by @lesteve, I have moved optima_iterations to class scope and taken care of the random numbers issue.
After your approval I will change the title from [WIP] to [MRG].

@jnothman
Member

jnothman commented Dec 8, 2016

MRG means "ready for full review before merging" so you should change that now.

@amanp10 amanp10 changed the title [WIP] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor [MRG] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor Dec 8, 2016
self.n_jobs = n_jobs

def _optima_iterations(self, optima, bounds, theta_initial, obj_func):
    optima.append(
Member

How can that work? If you read my previous #7997 (comment): when n_jobs != 1, each worker gets a copy of optima, so appending should not change optima in the main process...

You should return self._constrained_optimization(...) and do the optima.append in fit.

Member

Just a snippet to illustrate

from sklearn.externals.joblib import Parallel, delayed


def append(my_list, x):
    my_list.append(x)


if __name__ == '__main__':
    my_list = []
    Parallel(n_jobs=1)(delayed(append)(my_list, x) for x in range(10))
    print('my_list:', my_list)

    my_list_2 = []
    Parallel(n_jobs=2)(delayed(append)(my_list_2, x) for x in range(10))
    print('my_list_2', my_list_2)

Output:

my_list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
my_list_2 []
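The recommended pattern, with a hypothetical square worker: return from the delayed function and collect Parallel's return value in the parent, instead of mutating shared state.

```python
from joblib import Parallel, delayed


def square(x):
    return x * x


# Collect Parallel's return value instead of mutating a shared list;
# this works identically for every backend and n_jobs value.
# (On Windows, wrap the call in an ``if __name__ == '__main__':`` guard.)
results = Parallel(n_jobs=2)(delayed(square)(x) for x in range(10))
print('results:', results)
```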

        bounds))
theta_initials.append(theta_initial)
Parallel(n_jobs=self.n_jobs)(delayed(self._optima_iterations,
                                     check_pickle=False)
Member

Is check_pickle=False needed?

Contributor Author

It gives "TypeError: can't pickle instancemethod objects" when check_pickle=True, so I set check_pickle=False instead. Please let me know if that is wrong.
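For context, Python 2 (current when this PR was written) could not pickle bound methods at all, which is what check_pickle=True tripped over; Python 3 can, as this small check shows (Estimator is a made-up stand-in class):

```python
import pickle


class Estimator:
    def _optimize(self):
        return 42


est = Estimator()
# Under Python 2 the next line raised
# "TypeError: can't pickle instancemethod objects"; Python 3 can
# pickle bound methods of module-level classes.
blob = pickle.dumps(est._optimize)
print(pickle.loads(blob)())
```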

# Test to check the functioning of n_jobs parameter.
for kernel in kernels:
    gpr1 = GaussianProcessRegressor(kernel=kernel, n_jobs=1).fit(X, y)
    gpr2 = GaussianProcessRegressor(kernel=kernel, n_jobs=-1).fit(X, y)
Member

Nitpick: using n_jobs=2 makes it easier to spot the difference compared to n_jobs=1.

@amanp10
Contributor Author

amanp10 commented Dec 8, 2016

I will look into these issues and get back to you asap.

@lesteve
Member

lesteve commented Dec 8, 2016

I will look into these issues and get back to you asap.

I forgot to say, this is slightly worrying that the test was passing ... it is probably too forgiving.

@amanp10
Contributor Author

amanp10 commented Dec 8, 2016

@lesteve I have made the changes you suggested, however some tests are failing (I have noted them in the test file). Please go through the changes and give your review. Also, I have made a few changes in 2 example files to test this new parameter. Thank you for devoting so much of your time.

@@ -107,6 +108,11 @@ def optimizer(obj_func, initial_theta, bounds):
        given, it fixes the seed. Defaults to the global numpy random
        number generator.

    n_jobs : int, default: 1
        n_jobs is the number of workers requested by the callers.
Member

Please copy the docstring from within scikit-learn, rather than from joblib. This text is a bit obscure.

    self._constrained_optimization(obj_func, theta_initial,
                                   bounds))
theta_initials.append(theta_initial)
Parallel(n_jobs=self.n_jobs)(delayed(self._optima_iterations,
Member

There are a few ways you could make this line wrapping more readable! Please find one...

@@ -209,12 +220,16 @@ def obj_func(theta, eval_gradient=True):
                 "Multiple optimizer restarts (n_restarts_optimizer>0) "
                 "requires that all bounds are finite.")
             bounds = self.kernel_.bounds
-            for iteration in range(self.n_restarts_optimizer):
+            theta_initials = []
+            for i in range(self.n_restarts_optimizer):
Member

Use the size parameter of rng.uniform to generate n_restarts_optimizer elements in one fast operation.
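A sketch of the vectorized draw, with a made-up bounds array (one (low, high) row per hyperparameter):

```python
import numpy as np

rng = np.random.RandomState(0)
# Hypothetical bounds: one (low, high) row per hyperparameter.
bounds = np.array([[1e-2, 1e2],
                   [1e-5, 1e-1]])
n_restarts_optimizer = 5

# One vectorized draw replaces the Python-level loop: broadcasting
# pairs each column of the result with the matching bounds row.
theta_initials = rng.uniform(bounds[:, 0], bounds[:, 1],
                             size=(n_restarts_optimizer, bounds.shape[0]))
print(theta_initials.shape)
```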

@amanp10
Contributor Author

amanp10 commented Dec 9, 2016

I have made the changes @jnothman suggested above.

@amanp10
Contributor Author

amanp10 commented Dec 10, 2016

Hello,
I am not able to understand why some of the tests are failing. Any kind of help is welcome.

def test_n_jobs_parallel():
    # Test to check the functioning of n_jobs parameter.
    for kernel in kernels:
        gpr1 = GaussianProcessRegressor(kernel=kernel, n_jobs=1,
                                        n_restarts_optimizer=5).fit(X, y)
        gpr2 = GaussianProcessRegressor(kernel=kernel, n_jobs=2,
                                        n_restarts_optimizer=5).fit(X, y)
        gpr3 = GaussianProcessRegressor(kernel=kernel, n_jobs=-1,
                                        n_restarts_optimizer=5).fit(X, y)
        y1, y1_cov = gpr1.predict(X, return_cov=True)
        y2, y2_cov = gpr2.predict(X, return_cov=True)
        y3, y3_cov = gpr3.predict(X, return_cov=True)
        # Successfully passing tests
        assert_almost_equal(y1, y2)
        assert_almost_equal(y1, y3)
        assert_almost_equal(y1_cov, y2_cov)
        assert_almost_equal(y1_cov, y3_cov)
        # Failing tests
        assert_almost_equal(gpr1.alpha_, gpr2.alpha_)
        assert_almost_equal(gpr1.alpha_, gpr3.alpha_)
        assert_almost_equal(gpr1.log_marginal_likelihood_value_,
                            gpr2.log_marginal_likelihood_value_)
        assert_almost_equal(gpr1.log_marginal_likelihood_value_,
                            gpr3.log_marginal_likelihood_value_)

I also tried decreasing the precision (required decimal places). The problem arises for a few kernels only.

@jnothman
Member

(Meaningful commit messages would help us know what to expect when we review changes.)

@@ -39,7 +39,8 @@
 kernel = 1.0 * RBF(length_scale=100.0, length_scale_bounds=(1e-2, 1e3)) \
     + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))
 gp = GaussianProcessRegressor(kernel=kernel,
-                              alpha=0.0).fit(X, y)
+                              alpha=0.0, n_jobs=2,
+                              n_restarts_optimizer=3).fit(X, y)
Member

Is there a particular reason why you modified the examples?

Note: n_jobs != 1 is not really recommended in examples because it will cause a problem on Windows if we do not use an if __name__ == '__main__' guard.

Contributor Author

No particular reason. I will remove it.

Member

General guideline: if there is not a strong reason behind a change, just keep it as it was before. The more focused the changes, the easier and more pleasant it is to review them.

Contributor Author

I will surely take care of it from now on.

@amanp10
Contributor Author

amanp10 commented Dec 14, 2016

As @jnothman suggested above, I used _constrained_optimization directly without any intermediate function, and added the @staticmethod decorator to _constrained_optimization.
However, with check_pickle=True I get the following error:
TypeError: can't pickle function objects
There is a workaround using external libraries, but I am not sure whether to use it here.

Also, with backend=default I get the following error:
pickle.PicklingError: Can't pickle <function _constrained_optimization at 0x7fdac5486230>: it's not found as sklearn.gaussian_process.gpr._constrained_optimization
Using an external function instead of _constrained_optimization might solve the issue.
What should I do next?

@amanp10
Contributor Author

amanp10 commented Dec 24, 2016

Should I add a benchmark test for different n_jobs parameter with backend='threading' for now?

@jnothman
Member

Yes, benchmark with backend='threading'

@jnothman
Member

jnothman commented Dec 25, 2016

But it's also extremely easy to pull _constrained_optimization out as a function of (optimizer, obj_func, initial_theta, bounds) such that it is importable/picklable. However, obj_func remains unpicklable, and optimizer may be, so I think we're better off requiring that only the threading backend is used, and that check_pickle=False.
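A sketch of the thread-based approach with a made-up Optimizer class: with backend='threading' everything stays in one process, so neither the bound method nor obj_func ever needs to be pickled. (The check_pickle argument has since been removed from joblib entirely.)

```python
from joblib import Parallel, delayed


class Optimizer:
    def _restart(self, x):
        # Stand-in for one optimizer restart.
        return x * x

    def run(self):
        # Threads share the parent's memory, so neither the bound
        # method nor its arguments are ever serialized.
        return Parallel(n_jobs=2, backend="threading")(
            delayed(self._restart)(x) for x in range(5))


print(Optimizer().run())
```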

@amanp10
Contributor Author

amanp10 commented Dec 28, 2016

I have added the benchmark test but it's failing. I have used Python's timeit, but it seems that with n_jobs=1 the fit method executes faster than with n_jobs=2 or n_jobs=-1.

@jnothman
Member

jnothman commented Dec 28, 2016 via email

@amanp10
Contributor Author

amanp10 commented Dec 28, 2016

The shapes were X->(6, 1) and y->(6,).
What do you suggest I do?

@jnothman
Member

jnothman commented Dec 28, 2016 via email

@amanp10
Contributor Author

amanp10 commented Jan 3, 2017

What do you suggest I do next? I used the Boston House dataset but the tests were still failing. Should I try changing the function scopes so as to use backend="multiprocessing"?

@jnothman
Member

jnothman commented Jan 4, 2017

Oh, I'd not realised you'd added tests for benchmarking. Usually we don't do this, but build a benchmarking script and upload it as a gist, or if we want to be able to repeat a benchmark over time, contribute it to the benchmarks/ directory in scikit-learn.

While this means we only test one user's platform at a time, it means we can do things like flexibly plot multiple benchmarks.

@jnothman
Member

jnothman commented Jan 4, 2017

I'd rather see plots showing the size of the effect rather than just a passed test (which may fail if the machine is having a bad day, which is no good for continuous integration)
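A benchmarking-script sketch along these lines, using made-up restart and bench helpers: time each n_jobs setting several times and keep the best, so a single noisy run does not skew the comparison. (How much threading helps depends on NumPy releasing the GIL during the heavy linear algebra.)

```python
from timeit import default_timer as timer

import numpy as np
from joblib import Parallel, delayed


def restart(seed):
    # Stand-in for one optimizer restart: a CPU-bound chunk of work.
    rng = np.random.RandomState(seed)
    m = rng.randn(150, 150)
    return np.linalg.eigvalsh(m @ m.T).max()


def bench(n_jobs, n_restarts=8, repeats=3):
    # Keep the best of several repeats so one noisy run ("the machine
    # having a bad day") does not dominate the measurement.
    times = []
    for _ in range(repeats):
        start = timer()
        Parallel(n_jobs=n_jobs, backend="threading")(
            delayed(restart)(s) for s in range(n_restarts))
        times.append(timer() - start)
    return min(times)


for n_jobs in (1, 2):
    print(n_jobs, bench(n_jobs))
```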

@amanp10
Contributor Author

amanp10 commented Jan 5, 2017

I have added the Benchmark test in the benchmarks/ folder. Please have a look.

@jnothman
Member

jnothman commented Jan 6, 2017

I can only see n_jobs != 1 being an improvement on the 5th and to a lesser extent 6th kernels. (Though your measurements are not very robust as you make only one timing of each case.)

The gains so far seem to be quite marginal, and unless we can find a reasonable kernel or dataset size in which the gains are greater, perhaps this work is not of sufficient utility to merge :\

@amanp10
Contributor Author

amanp10 commented Jan 11, 2017

@jnothman What do you suggest I do? Should I look for more kernels or dataset sizes, or should I stop working on this issue?

@jnothman
Member

jnothman commented Jan 11, 2017 via email

@amanp10
Contributor Author

amanp10 commented Jan 12, 2017

I tried using different dataset sizes but it was not of any use. I don't have much knowledge about kernels, so I will not be working on this issue unless advised otherwise.

@jnothman
Member

jnothman commented Jan 12, 2017 via email

@jnothman
Member

jnothman commented Jan 12, 2017 via email

@amanp10
Contributor Author

amanp10 commented Jan 12, 2017

I did learn a lot. Thanks to you @jnothman and @lesteve

@amanp10 amanp10 deleted the mybranch branch January 29, 2017 17:51