
# Make return_std 20x faster in Gaussian Processes (includes solution) #9234

Closed
andrewww opened this issue Jun 27, 2017 · 10 comments
### andrewww commented Jun 27, 2017 • edited by TomDLT

Two requests:

(1) Please replace this line: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/gaussian_process/gpr.py#L329

from this

```python
y_var -= np.einsum("ki,kj,ij->k", K_trans, K_trans, K_inv)
```

to this

```python
sum1 = np.dot(K_trans, K_inv).T
y_var -= np.einsum("ki,ik->k", K_trans, sum1)
```

For an input data set of size 800x1, the time drops from 12.7 seconds to 0.2 seconds. I have validated that the results agree to within 1e-12 or smaller.

(2) Please cache the result of the `K_inv` computation. It depends only on the result of training, and can be very costly for repeated calls to the class.
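A minimal standalone check that the two expressions agree (random matrices stand in for `K_trans` and `K_inv`; the 800x800 shapes are an assumption mimicking 800 training points):

```python
import numpy as np

rng = np.random.RandomState(0)
K_trans = rng.rand(800, 800)   # stand-in for k(X*, X_train)
K_inv = rng.rand(800, 800)
K_inv = K_inv + K_inv.T        # symmetric, like the real inverse kernel matrix

# Original three-operand contraction.
slow = np.einsum("ki,kj,ij->k", K_trans, K_trans, K_inv)

# Proposed rewrite: let BLAS do the matrix product, then contract.
sum1 = np.dot(K_trans, K_inv).T
fast = np.einsum("ki,ik->k", K_trans, sum1)

assert np.allclose(slow, fast)  # identical up to rounding
```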

### andrewww commented Jun 27, 2017 • edited by TomDLT

Complete solution, starting here: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/gaussian_process/gpr.py#L322

```python
# Cache K_inv: it depends only on the training data, so compute it once.
if not hasattr(self, 'K_inv_'):
    L_inv = solve_triangular(self.L_.T, np.eye(self.L_.shape[0]))
    self.K_inv_ = L_inv.dot(L_inv.T)

# Compute variance of predictive distribution
y_var = self.kernel_.diag(X)
sum1 = np.dot(K_trans, self.K_inv_).T
y_var1 = y_var - np.einsum("ki,ik->k", K_trans, sum1)
# y_var2 = y_var - np.einsum("ki,kj,ij->k", K_trans, K_trans, self.K_inv_)
# assert np.all(np.abs(y_var1 - y_var2) < 1e-12)
y_var = y_var1
```
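For reference, a small standalone sketch of why `L_inv.dot(L_inv.T)` gives the kernel inverse (a random SPD matrix stands in for the trained kernel matrix; `self.L_` in gpr.py is its lower Cholesky factor):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.RandomState(0)
A = rng.rand(5, 5)
K = A @ A.T + 5 * np.eye(5)   # symmetric positive definite "kernel"

L = cholesky(K, lower=True)   # K = L @ L.T, the role of self.L_
L_inv = solve_triangular(L.T, np.eye(L.shape[0]))  # inv(L.T)
K_inv = L_inv.dot(L_inv.T)    # inv(L.T) @ inv(L) == inv(K)

assert np.allclose(K_inv, np.linalg.inv(K))
```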

### TomDLT commented Jun 27, 2017 • edited

Thanks for the suggestion. Can you open a pull request with this change? Note that the first part was already solved in #8591, but not the second.

### andrewww commented Jun 27, 2017

 I'm sorry, I can't. I'm posting this from an environment where I can access the website, but none of the other GitHub tools.

### jnothman commented Jun 27, 2017

This looks like it might consume some negligible extra memory but otherwise should only benefit.

### minghui-liu commented Jun 27, 2017

 I would like to help and make the changes if that's ok.

### jmschrei commented Jun 27, 2017

Go ahead.

### ogrisel commented Sep 1, 2017

 Fixed in 6e01fef from #9236. Thanks @minghui-liu and @andrewww.
ogrisel closed this Sep 1, 2017

### ogrisel commented Sep 1, 2017

BTW, I did not observe a 20x speed-up in my tests. The speed stayed approximately the same on a 1000x5 dataset generated with `sklearn.datasets.make_regression`. Note: the second call to predict is significantly faster because of the cached `K_inv`. @andrewww I would be curious to know more about the kind of data where you observed the initially reported speedup.
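Roughly the kind of benchmark described above (the kernel and other settings are assumptions, not the script that was actually run):

```python
import time
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor

X, y = make_regression(n_samples=1000, n_features=5, random_state=0)
gpr = GaussianProcessRegressor().fit(X, y)

# The second call should be much faster thanks to the cached K_inv.
for label in ("first", "second"):
    t0 = time.time()
    gpr.predict(X, return_std=True)
    print("%s predict call: %.2f s" % (label, time.time() - t0))
```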

### andrewww commented Sep 1, 2017 • edited

Hmmm... This was literally a make-or-break change to the code for me, i.e., the code was so slow that I could not actually use it without this change (the change to the `.einsum()` call). The only thing I can think of is that I'm on Windows 7 x64 / Anaconda 3.1.4. Doesn't numpy sometimes behave differently on different platforms? Maybe the Windows `einsum()` call is a lot slower than on Linux? Also, apparently it was already fixed in an earlier issue, #8591.
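If anyone wants to compare platforms, a small timing sketch of just the two contractions (the 800x800 shapes are an assumption based on the 800-point case reported above):

```python
import timeit
import numpy as np

rng = np.random.RandomState(0)
K_trans = rng.rand(800, 800)
K_inv = rng.rand(800, 800)

t_old = timeit.timeit(
    lambda: np.einsum("ki,kj,ij->k", K_trans, K_trans, K_inv), number=3)
t_new = timeit.timeit(
    lambda: np.einsum("ki,ik->k", K_trans, np.dot(K_trans, K_inv).T),
    number=3)
print("three-operand einsum: %.2f s; dot + einsum: %.2f s" % (t_old, t_new))
```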

### refrigerator commented Dec 5, 2017

 Is anyone still having problems with this? I have a model trained on 15,000 data points, and predictions with `return_std=True` take about 2 minutes for a single prediction, whereas with `return_std=False` it's basically instant.