
[MRG] Add clone_kernel option to make gaussian process models faster #14378

Merged
merged 6 commits into scikit-learn:master on Jul 31, 2019

Conversation

@c-bata (Contributor) commented Jul 15, 2019

Reference Issues/PRs

None

What does this implement/fix? Explain your changes.

Problems

I want to make GaussianProcessRegressor faster. I profiled GaussianProcessRegressor#fit using cProfile and generated a call graph to find the bottlenecks. The result is below:

[call-graph image: output]

The graph shows that Kernel#clone_with_theta accounts for 55.18% of the total fit time.
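
For reference, a minimal sketch of this kind of profiling setup (the data, the model settings, and the gprof2dot rendering step are illustrative assumptions, not the actual gpr_bench.py):

import cProfile
import pstats

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Illustrative data; the real benchmark script is not reproduced here.
rng = np.random.RandomState(0)
X = rng.uniform(-20, 20, (200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

gpr = GaussianProcessRegressor(n_restarts_optimizer=10)
cProfile.run("gpr.fit(X, y)", "gpr_fit.prof")

# Show the ten most expensive calls by cumulative time.
pstats.Stats("gpr_fit.prof").sort_stats("cumulative").print_stats(10)

# A call graph like the one above can then be rendered with e.g. gprof2dot:
#   gprof2dot -f pstats gpr_fit.prof | dot -Tpng -o output.png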

Solutions

Basically, I think we don't need to clone the kernels; it is enough to simply replace the theta attribute. According to my benchmark in this gist, GaussianProcessRegressor#fit becomes about 26.4% faster with this patch: the score lines below report the mean and standard deviation of the elapsed times, and the mean drops from roughly 3.70 s to 2.72 s.
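
As a minimal sketch, the change amounts to the following (illustrative only; the actual edit is inside the hyperparameter-optimization path of the GP models):

import numpy as np
from sklearn.gaussian_process.kernels import RBF

kernel = RBF(length_scale=1.0)
theta = np.log([0.5])  # kernel hyperparameters are stored on a log scale

# Before: every objective-function evaluation clones the kernel object.
cloned = kernel.clone_with_theta(theta)

# After: reuse the same object and just overwrite theta in place.
kernel.theta = theta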

Before:

$ python examples/gpr_bench.py 
elapsed: 3.620s
elapsed: 4.616s
elapsed: 3.892s
elapsed: 3.784s
elapsed: 3.908s
elapsed: 4.280s
elapsed: 2.884s
elapsed: 2.543s
elapsed: 2.502s
elapsed: 4.946s
score: 3.6975975036621094 0.7911395580877181

After:

$ python examples/gpr_bench.py 
elapsed: 2.642s
elapsed: 2.423s
elapsed: 2.921s
elapsed: 2.704s
elapsed: 2.856s
elapsed: 3.667s
elapsed: 2.737s
elapsed: 2.397s
elapsed: 2.292s
elapsed: 2.590s
score: 2.722917938232422 0.36802930790866245

Any other comments?

To confirm this PR doesn't break the logic, I generated plots of the model predictions (I might need to design more suitable kernel functions).

[prediction plots: before (gpr_predict) and after (gpr_predict2); the 'after' plot is exactly the same as the 'before' one]
@c-bata c-bata changed the title Make fitting of GaussianProcessRegressor faster Make a fitting of GaussianProcessRegressor faster Jul 16, 2019
@rth (Member) left a comment

Thanks for investigating, @c-bata! I went through the GPR code and couldn't see a place (e.g. an in-place modification of the kernel) that would require cloning it.

Cloning the kernel with a 1-element array as theta takes about 130 µs:

In [2]: import numpy as np                                                                                                                     
In [3]: from sklearn.gaussian_process.kernels import RBF                                                                                                                                                                                                                      
In [4]: kernel = RBF()                                                                                                                         
In [5]: %timeit kernel.clone_with_theta(np.arange(1))                                                                                          
134 µs ± 1.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

and as far as I could tell, there are on the order of 50 L-BFGS-B iterations; the actual number of loss-function evaluations is likely higher. I imagine this mostly helps with run time when n_features is low.

Please also make this change to gpc.py.

It could also help if you check that the results before and after this change are identical, not just visually but with numpy.testing.assert_allclose (even if we don't add it to the tests).


@c-bata (Contributor, Author) commented Jul 16, 2019

Thanks for reviewing! I'll apply this change to gpc.py 👍

It could also help if you check that the results before and after this change are identical, not just visually but with numpy.testing.assert_allclose (even if we don't add it to the tests).

I confirmed that numpy.testing.assert_allclose doesn't raise an AssertionError in the following code:

import numpy as np
from sklearn.gaussian_process import kernels as sk_kern
from sklearn.gaussian_process import GaussianProcessRegressor


def objective(x):
    return x + 20 * np.sin(x)


def main():
    kernel = sk_kern.RBF(1.0, (1e-3, 1e3)) + sk_kern.ConstantKernel(1.0, (1e-3, 1e3))
    clf = GaussianProcessRegressor(
        kernel=kernel,
        alpha=1e-10,
        optimizer="fmin_l_bfgs_b",
        n_restarts_optimizer=20,
        normalize_y=True)

    np.random.seed(0)

    x_train = np.random.uniform(-20, 20, 200)
    y_train = objective(x_train) + np.random.normal(loc=0, scale=.1, size=x_train.shape)

    clf.fit(x_train.reshape(-1, 1), y_train)

    x_test = np.linspace(-20., 20., 200).reshape(-1, 1)
    pred_mean, pred_std = clf.predict(x_test, return_std=True)

    # First run (before this patch): uncomment these to save the reference outputs
    #np.save("pred_mean", pred_mean)
    #np.save("pred_std", pred_std)

    # Second run (after this patch): compare against the saved references
    orig_mean = np.load("pred_mean.npy")
    orig_std = np.load("pred_std.npy")
    np.testing.assert_allclose(orig_mean, pred_mean)
    np.testing.assert_allclose(orig_std, pred_std)


if __name__ == '__main__':
    main()


@c-bata (Contributor, Author) commented Jul 16, 2019

I added the patch for GaussianProcessClassifier.

The benchmark and its results are below. It seems GaussianProcessClassifier becomes about 25.2% faster in a benchmark that fits the iris dataset.
https://gist.github.com/c-bata/2c7c8aa08852b5bc05bea37f33cd1962

I also confirmed that np.testing.assert_allclose doesn't raise an AssertionError when comparing the prediction probabilities of GPC.
https://gist.github.com/c-bata/2c7c8aa08852b5bc05bea37f33cd1962#file-gpc_bench-py-L28-L29
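
A minimal sketch of this kind of check (the gist above is the actual script; the file name and settings here are illustrative):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.gaussian_process import GaussianProcessClassifier

X, y = load_iris(return_X_y=True)
clf = GaussianProcessClassifier(random_state=0).fit(X, y)
proba = clf.predict_proba(X)

# First run (before the patch): save the reference probabilities.
# np.save("gpc_proba.npy", proba)

# Second run (after the patch): compare against the saved reference.
np.testing.assert_allclose(np.load("gpc_proba.npy"), proba)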


@c-bata c-bata changed the title Make a fitting of GaussianProcessRegressor faster Avoid cloning kernel objects of GaussianProcess models Jul 16, 2019
@c-bata c-bata changed the title Avoid cloning kernel objects of GaussianProcess models [MRG] Avoid cloning kernel objects of GaussianProcess models Jul 16, 2019
@rth approved these changes Jul 17, 2019
@rth (Member) left a comment

Thanks, @c-bata!

LGTM, but a second thorough review would be helpful.


@adrinjalali (Member) commented Jul 18, 2019

log_marginal_likelihood() is a public function, and you're changing its behavior here: without cloning, calling it with an explicit theta now mutates the kernel_ attribute in place.

This PR changes the output of the third line here:

gpr.fit(X, y)
gpr.log_marginal_likelihood(theta=another_value)
gpr.log_marginal_likelihood()
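
To make the concern concrete, a hedged sketch of the side effect (hypothetical data; without cloning, the theta passed in is written through to the fitted kernel):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-5, 5, (50, 1))
y = np.sin(X).ravel()

gpr = GaussianProcessRegressor().fit(X, y)
theta_fitted = gpr.kernel_.theta.copy()

gpr.log_marginal_likelihood(theta=theta_fitted + 1.0)

# With cloning this prints True; without cloning it would print False,
# because the call above would have mutated gpr.kernel_.theta.
print(np.array_equal(theta_fitted, gpr.kernel_.theta))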

You could maybe add a parameter like clone_kernel=True to log_marginal_likelihood, and pass False during fit, I guess.
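
A rough sketch of what that option could look like (a hypothetical class, not scikit-learn's verbatim implementation):

import numpy as np
from sklearn.gaussian_process.kernels import RBF

class GPRSketch:
    # Illustration of the proposed option, not scikit-learn's actual class.
    def __init__(self):
        self.kernel_ = RBF()
        self.log_marginal_likelihood_value_ = 0.0

    def log_marginal_likelihood(self, theta=None, eval_gradient=False,
                                clone_kernel=True):
        if theta is None:
            # No theta given: return the value cached during fit().
            return self.log_marginal_likelihood_value_
        if clone_kernel:
            # Default: side-effect free, self.kernel_ keeps its theta.
            kernel = self.kernel_.clone_with_theta(theta)
        else:
            # Fast path for the optimizer loop inside fit():
            # mutate the fitted kernel in place and skip the clone.
            kernel = self.kernel_
            kernel.theta = theta
        # The real method would compute the likelihood using `kernel`;
        # a placeholder return keeps this sketch runnable.
        return float(np.sum(kernel.theta))

fit() could then call log_marginal_likelihood(theta, clone_kernel=False) internally, keeping the public default side-effect free.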


@c-bata (Contributor, Author) commented Jul 19, 2019

Thanks for reviewing, @adrinjalali!
To confirm whether the output of the third line changes, I executed the following script, but the output of the third line did not change:
https://gist.github.com/c-bata/e6210a345a8431fbe293a725d1f4cc16

Could you tell me a case where the output of the third line would change? If there is one, I'll push the following change to this branch:
https://github.com/c-bata/scikit-learn/compare/faster-gpr...c-bata:faster-gpr-with-option?expand=1


@adrinjalali (Member) commented Jul 19, 2019

I haven't checked why the output doesn't change, but if it doesn't, that's another issue in the code that we need to investigate. Your proposed patch looks good; please commit it here and we'll continue from there.


@c-bata (Contributor, Author) commented Jul 19, 2019

I see. I pushed 0a98cf3 to this branch.


Review thread on sklearn/gaussian_process/gpr.py (outdated, resolved)
@rth rth added Needs work and removed Needs work labels Jul 25, 2019
@c-bata (Contributor, Author) commented Jul 28, 2019

I'll squash the commits and rebase on the master branch.


@c-bata c-bata changed the title [MRG] Avoid cloning kernel objects of GaussianProcess models [MRG] Add clone_kernel option to make gaussian process models faster Jul 28, 2019
Review thread on sklearn/gaussian_process/gpc.py (outdated, resolved)
Review thread on sklearn/gaussian_process/gpc.py (outdated, resolved)
Review thread on sklearn/gaussian_process/gpr.py (outdated, resolved)
Co-Authored-By: Adrin Jalali <adrin.jalali@gmail.com>
@@ -318,6 +320,10 @@ def log_marginal_likelihood(self, theta=None, eval_gradient=False):
to the kernel hyperparameters at position theta is returned
additionally. If True, theta must not be None.
clone_kernel : bool, default: True
If True, the kernel attribute is copied. Please specify False if
you want to make this method call faster.
@adrinjalali (Member) commented Jul 28, 2019

maybe something like:

            If True, the kernel attribute is copied. If False, the kernel
            attribute is modified, but may result in a performance improvement.


@c-bata (Contributor, Author) commented Jul 28, 2019

Thank you! I'll change it!


clone_kernel : bool, default: True
If True, the kernel attribute is copied. Please specify False if
you want to make this method call faster.
@adrinjalali (Member) commented Jul 28, 2019

same comments as above


@adrinjalali (Member) commented Jul 28, 2019

Also, please avoid force-pushing the commits; we squash them before merging anyway :)


@c-bata (Contributor, Author) commented Jul 30, 2019

Hi @adrinjalali! Thanks for your review.
I pushed a commit applying your comments :)


@adrinjalali (Member) commented Jul 30, 2019

Please add an entry to the change log at doc/whats_new/v*.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.


and :func:`gaussian_process.GaussianProcessRegressor.log_marginal_likelihood` now
accept a ``clone_kernel=True`` keyword argument. When set to ``False``,
the kernel attribute is modified, but may result in a performance improvement.
:func:`gaussian_process.GaussianProcessClassifier.fit` and
@adrinjalali (Member) commented Jul 30, 2019

I think you don't need lines 144-146


@c-bata (Contributor, Author) commented Jul 30, 2019

OK! I'll remove lines 144-146. BTW, do you know why the doc-min-dependencies job fails on CircleCI?


@adrinjalali (Member) commented Jul 30, 2019

Don't worry about CircleCI, we know about it. Conda has removed some old packages; that's why it fails.


@c-bata (Contributor, Author) commented Jul 30, 2019

I got it 👍 Thank you!


@adrinjalali (Member) left a comment

LGTM, pinging @rth for a second approval.


@rth approved these changes Jul 31, 2019
@rth (Member) commented Jul 31, 2019

Thanks, @c-bata!


@rth rth merged commit ec9b11b into scikit-learn:master Jul 31, 2019
14 of 17 checks passed