[MRG] Issue #7987: Embarrassingly parallel "n_restarts_optimizer" in GaussianProcessRegressor #7997
Conversation
Please modify tests to use the new parameter and check results are identical
…On 7 December 2016 at 20:32, Aman Pratik ***@***.***> wrote:
Fixes #7987
It adds the embarrassingly parallel helper to the GaussianProcessRegressor class.
I have used check_pickle=False for the delayed function, as it was not working otherwise. However, I am not sure about the logic behind this.
    theta_initial = \
        self.rng.uniform(bounds[:, 0], bounds[:, 1])
    optima.append(
        self._constrained_optimization(obj_func, theta_initial,
                                       bounds))
    Parallel(n_jobs=self.n_jobs)(delayed(optima_iterations,
                                         check_pickle=False)()
I would be really surprised if using an inner function like this was working, even with check_pickle=False, for n_jobs != 1. You will need to move optima_iterations to the module scope (or class level, maybe).
Drawing random numbers in parallel processes seems a bit dangerous too, and I don't think you will be able to match the result of n_jobs=1 with n_jobs != 1 this way. An alternative would be to draw all the random numbers first and then pass them to the function inside Parallel.
Optional: refactoring optima_iterations into a pure function that does not take self as an argument is probably a good idea to reduce surprising behaviours.
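A sketch of what that suggestion could look like (all names here are hypothetical stand-ins, not the actual scikit-learn code): a module-level pure function that never touches `self`, fed starting points that were all drawn up front in the main process, so the result no longer depends on n_jobs.

```python
import numpy as np
from joblib import Parallel, delayed

def run_restart(obj_func, theta_initial, bounds):
    # Stand-in for self._constrained_optimization(...): a pure,
    # picklable, module-level function.
    return obj_func(theta_initial)

def obj_func(theta):
    # Toy objective for illustration only.
    return float(np.sum(theta ** 2))

rng = np.random.RandomState(0)
bounds = np.array([[0.0, 1.0], [0.0, 1.0]])

# 1) Draw all random starting points in the main process ...
theta_initials = [rng.uniform(bounds[:, 0], bounds[:, 1]) for _ in range(3)]

# 2) ... then dispatch them to the workers; each worker receives a
# distinct, pre-drawn starting point instead of a copied RNG.
optima = Parallel(n_jobs=2)(
    delayed(run_restart)(obj_func, theta, bounds)
    for theta in theta_initials)
```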
To illustrate what I mean:

    from joblib import Parallel, delayed
    import numpy as np

    def func(rng):
        return rng.randn(2)

    rng = np.random.RandomState(0)
    n_jobs = 2
    result = Parallel(n_jobs=n_jobs)(delayed(func)(rng) for x in [1, 2, 3])
    print(result)
with n_jobs=1:
[array([ 1.76405235, 0.40015721]), array([ 0.97873798, 2.2408932 ]), array([ 1.86755799, -0.97727788])]
with n_jobs=2:
[array([ 1.76405235, 0.40015721]), array([ 1.76405235, 0.40015721]), array([ 1.76405235, 0.40015721])]
What happens in the n_jobs=2 case is that each subprocess gets a copy of rng, so the random numbers are the same for each iteration.

MRG means "ready for full review before merging", so you should change that now.
    self.n_jobs = n_jobs

    def _optima_iterations(self, optima, bounds, theta_initial, obj_func):
        optima.append(
How can that work? As I explained in my previous #7997 (comment), when n_jobs != 1 each worker gets a copy of optima, and this should not change optima in the main process. You should return self._constrained_optimization(...) and do the optima.append in fit.
Just a snippet to illustrate:

    from sklearn.externals.joblib import Parallel, delayed

    def append(my_list, x):
        my_list.append(x)

    if __name__ == '__main__':
        my_list = []
        Parallel(n_jobs=1)(delayed(append)(my_list, x) for x in range(10))
        print('my_list:', my_list)

        my_list_2 = []
        Parallel(n_jobs=2)(delayed(append)(my_list_2, x) for x in range(10))
        print('my_list_2', my_list_2)

Output:

    my_list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    my_list_2 []
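To spell out the fix being suggested (with a toy stand-in for the real worker function): return the value from the delayed callable and let Parallel collect the list, instead of appending to a shared list inside the workers.

```python
from joblib import Parallel, delayed

def run_one(x):
    # Return the result; Parallel gathers return values in input order,
    # so nothing needs to be mutated across process boundaries.
    return x * x

# The collected list is identical for n_jobs=1 and n_jobs=2.
optima = Parallel(n_jobs=2)(delayed(run_one)(x) for x in range(10))
print(optima)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```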
                bounds))
        theta_initials.append(theta_initial)
    Parallel(n_jobs=self.n_jobs)(delayed(self._optima_iterations,
                                         check_pickle=False)
Is check_pickle=False needed?
It gives the error "TypeError: can't pickle instancemethod objects" when check_pickle=True, hence I did it this way. Please let me know if it is wrong.
    # Test to check the functioning of n_jobs parameter.
    for kernel in kernels:
        gpr1 = GaussianProcessRegressor(kernel=kernel, n_jobs=1).fit(X, y)
        gpr2 = GaussianProcessRegressor(kernel=kernel, n_jobs=-1).fit(X, y)
Nitpick: using n_jobs=2 makes it easier to spot the difference compared to 1.

I will look into these issues and get back to you asap.

I forgot to say, it is slightly worrying that the test was passing ... it is probably too forgiving.

@lesteve I have made the changes you suggested; however, some tests are failing (I have mentioned them in the test file). Please go through the changes and give your review. Also, I have made a few changes in 2 example files to test this new parameter. Thank you for devoting so much of your time.
@@ -107,6 +108,11 @@ def optimizer(obj_func, initial_theta, bounds):
        given, it fixes the seed. Defaults to the global numpy random
        number generator.

    n_jobs : int, default: 1
        n_jobs is the number of workers requested by the callers.
Please copy the docstring from within scikit-learn, rather than from joblib. This text is a bit obscure.
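For illustration only, a sketch of the kind of numpydoc wording scikit-learn tends to use for this parameter (a paraphrase, not copied from any particular estimator):

```python
def sketch():
    """Illustrative docstring fragment only.

    Parameters
    ----------
    n_jobs : int, optional (default=1)
        The number of jobs to run in parallel. If -1, then the number
        of jobs is set to the number of CPU cores.
    """
```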
            self._constrained_optimization(obj_func, theta_initial,
                                           bounds))
        theta_initials.append(theta_initial)
    Parallel(n_jobs=self.n_jobs)(delayed(self._optima_iterations,
There are a few ways you could make this line wrapping more readable! Please find one...
@@ -209,12 +220,16 @@ def obj_func(theta, eval_gradient=True):
                "Multiple optimizer restarts (n_restarts_optimizer>0) "
                "requires that all bounds are finite.")
            bounds = self.kernel_.bounds
-           for iteration in range(self.n_restarts_optimizer):
+           theta_initials = []
+           for i in range(self.n_restarts_optimizer):
Use the size parameter of rng.uniform to generate n_restarts_optimizer elements in one fast operation.
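A sketch of that suggestion (the bounds values below are made up for illustration): with per-dimension low/high columns, a single call with the size parameter draws every restart's starting point at once.

```python
import numpy as np

rng = np.random.RandomState(0)
n_restarts_optimizer = 5
# One (low, high) row per kernel hyperparameter; illustrative values.
bounds = np.array([[1e-2, 1e3],
                   [1e-10, 1e1]])

# low and high have shape (2,); size=(5, 2) broadcasts them across the
# rows, replacing the Python-level loop with one vectorized draw.
theta_initials = rng.uniform(bounds[:, 0], bounds[:, 1],
                             size=(n_restarts_optimizer, bounds.shape[0]))
print(theta_initials.shape)  # (5, 2)
```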
I have made the changes @jnothman suggested above.
Hello, in def test_n_jobs_parallel(): I also tried decreasing the precision value (required decimal places). This problem arises for a few kernels only.
(Meaningful commit messages would help us know what to expect when we review changes.)
@@ -39,7 +39,8 @@
 kernel = 1.0 * RBF(length_scale=100.0, length_scale_bounds=(1e-2, 1e3)) \
     + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))
 gp = GaussianProcessRegressor(kernel=kernel,
-                              alpha=0.0).fit(X, y)
+                              alpha=0.0, n_jobs=2,
+                              n_restarts_optimizer=3).fit(X, y)
Is there a particular reason why you modified the examples?

Note: n_jobs != 1 is not really recommended in examples because it will cause a problem on Windows if we do not use if __name__ == '__main__'.
No particular reason. I will remove it.
General guideline: if there is not a strong reason behind a change, just keep it as it was before. The more focused the changes, the easier and more pleasant it is to review them.
I will surely take care of it from now on.
According to @jnothman I used … Also, when backend=default, I get the following error: …
Should I add a benchmark test for different backends?
Yes, benchmark with backend='threading'.
But it's also extremely easy to pull …
I have added the benchmark test but it's failing. I have used Python timeit; however, it seems that with n_jobs=1 the fit method executes faster than with n_jobs=2 and n_jobs=-1.
Yes, that'll happen if there's not enough time spent in parts of the code where the GIL is released. What shape of dataset are you doing that benchmark over? I suspect that indeed, unless this were rewritten in Cython, we'd not be releasing the GIL enough.

I have a feeling that @lesteve's request for a benchmark was well justified, and we will find no gains here unless we can avoid closures and use multiprocessing.
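A minimal illustration of the GIL point (toy workload, not the GPR code): with backend='threading', pure-Python work cannot run in two threads at once, so the results stay correct but no speedup should be expected. Timings are omitted here because they vary by machine.

```python
from joblib import Parallel, delayed

def work(n):
    # Pure-Python arithmetic holds the GIL the whole time, so the
    # threading backend effectively serializes these calls.
    return sum(i * i for i in range(n))

serial = [work(50_000) for _ in range(4)]
threaded = Parallel(n_jobs=2, backend='threading')(
    delayed(work)(50_000) for _ in range(4))
print(threaded == serial)  # correctness is unaffected; only speed is
```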
The shapes were X -> (6, 1) and y -> (6,). What do you suggest I do?
Oh well, that is very small for a benchmark. Benchmarks tend to use something closer to a realistic but large sample size, since performance on small datasets rarely matters.
What do you suggest I do next? I used the …
Oh, I'd not realised you'd added tests for benchmarking. Usually we don't do this, but instead build a benchmarking script and upload it as a gist, or, if we want to be able to repeat a benchmark over time, contribute it to the … While this means we only test one user's platform at a time, it means we can do things like flexibly plot multiple benchmarks.
I'd rather see plots showing the size of the effect than just a passed test (which may fail if the machine is having a bad day, which is no good for continuous integration).
I have added the benchmark test in the …
I can only see n_jobs != 1 being an improvement on the 5th and, to a lesser extent, 6th kernels. (Though your measurements are not very robust, as you make only one timing of each case.) The gains so far seem to be quite marginal, and unless we can find a reasonable kernel or dataset size in which the gains are greater, perhaps this work is not of sufficient utility to merge :\
@jnothman What do you suggest I do? Should I look for more kernels or data sizes? Or should I stop working on this issue?
I suspect we're not going to get anywhere with this. But if you have a moment to try different sizes, etc., why not?
I tried using different data sizes but it was not of any use. I don't have much knowledge about kernels, so I will not be working on this issue unless advised otherwise.
I think I'll close this, then.
Sorry about the wasted effort. I hope you learnt from it, at least :|