PERF float32 propagation in GaussianMixture #30415
lesteve merged 13 commits into scikit-learn:main
Conversation
@ogrisel I added a small adjustment and the float32 tests seem to pass now. Do we have any way to invoke them in the CI as well?
Yes, by pushing a commit with one of the commit message markers documented at https://scikit-learn.org/stable/developers/contributing.html#commit-message-markers
Okay nice, let me see if the CI passes first before trying out float32, because for the initial pipelines the
I confirm that the tests now pass locally. Let me add a changelog entry.
Let me do a quick benchmark to see the impact of this change.
On my Apple M1 with Accelerate, I measure a nice ~3x speed up and a 2x memory efficiency improvement. Benchmark script:

```python
from sklearn.mixture import GaussianMixture
from sklearn.datasets import make_blobs
import numpy as np
from time import perf_counter

X, y = make_blobs(n_samples=int(1e6), n_features=50, centers=10, random_state=0)
X = X.astype(np.float32)

tic = perf_counter()
GaussianMixture(n_components=10, covariance_type="full", random_state=0).fit(X)
toc = perf_counter()
print(f"Elapsed time: {toc - tic:.1f} s")
```

I used the scalene profiler to report memory usage.
|
I also tried with OpenBLAS; memory usage is the same for OpenBLAS and Accelerate.
Even then there is a nice speedup.
Once this is merged, I think this estimator would be a nice candidate for array API support.
Great work! Beyond the performance enhancement, in your tests, do you obtain identical results after downcasting X to np.float32?
The existing tests still pass when we run them with the float32 dtype enabled.
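For reference, a quick way to check agreement between the two dtypes (a standalone sketch, not a test from this PR):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=10_000, n_features=5, centers=3, random_state=0)

gm64 = GaussianMixture(n_components=3, random_state=0).fit(X)
gm32 = GaussianMixture(n_components=3, random_state=0).fit(X.astype(np.float32))

# Fitted parameters should agree up to roughly float32 precision.
np.testing.assert_allclose(gm64.means_, gm32.means_, rtol=1e-4)
```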
betatim left a comment
Looks reasonable to me. A few little comments.
One question: is there a unit test that checks that the dtype of `gmm.predict(X)` matches the dtype of X, or should it match the dtype of the X that was used for fitting? Or do we not need that kind of test for some reason?
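A test along these lines could cover it (a hypothetical sketch; since `predict` returns integer labels, the dtype check is more natural on `predict_proba` and `score_samples`):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

def test_gmm_output_dtype_matches_input():
    X, _ = make_blobs(n_samples=500, centers=3, random_state=0)
    X32 = X.astype(np.float32)
    gmm = GaussianMixture(n_components=3, random_state=0).fit(X32)
    # Assumes the PR propagates float32 through the probabilistic outputs.
    assert gmm.predict_proba(X32).dtype == np.float32
    assert gmm.score_samples(X32).dtype == np.float32
```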
```diff
 elif self.init_params == "random":
     resp = random_state.uniform(size=(n_samples, self.n_components))
-    resp /= resp.sum(axis=1)[:, np.newaxis]
+    resp /= resp.sum(axis=1)[:, np.newaxis].astype(X.dtype)
```
I am not sure it matters that much, but it seems like `resp` is always going to be float64 (apparently `.uniform` does not let you specify a dtype). In particular, the `.astype(X.dtype)` is not going to affect `resp.dtype`?
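A minimal demonstration of the promotion behavior in question (sketch):

```python
import numpy as np

rng = np.random.RandomState(0)
resp = rng.uniform(size=(4, 3))                # float64; uniform() has no dtype argument
norm = resp.sum(axis=1)[:, np.newaxis].astype(np.float32)

resp /= norm                                   # in-place op keeps the left operand's dtype
print(resp.dtype)                              # float64
```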
Indeed! TIL. If `resp.dtype` is not float32, then dividing it by a float32 array won't change its dtype :-/
I guess we'd have to add a `resp = resp.astype(X.dtype)` and maybe can remove the `.astype(X.dtype)` from `resp /= resp.sum(axis=1)[:, np.newaxis].astype(X.dtype)`?
Yes, this was a valid point. Thanks for catching it @lesteve.
I updated this and also added `global_dtype` in a test that uses this particular initialization.
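For context, `global_dtype` is scikit-learn's test fixture that parametrizes a test over float64 (and float32 when the float32 CI run is enabled). A hypothetical sketch of such a test, not the exact one added in the PR:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

def test_gaussian_mixture_random_init_dtype(global_dtype):
    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
    X = X.astype(global_dtype)
    gmm = GaussianMixture(
        n_components=3, init_params="random", random_state=0
    ).fit(X)
    # Fitted attributes should follow the input dtype.
    assert gmm.means_.dtype == global_dtype
```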
Thanks, @lesteve and @OmarManzoor for the latest round of fixes / testing.
```python
resp = random_state.uniform(size=(n_samples, self.n_components)).astype(
    X.dtype
)
```
I am not sure this makes a big difference, but we may as well avoid a copy if `X.dtype` is float64 (which I guess is the most common use case)?
```diff
-resp = random_state.uniform(size=(n_samples, self.n_components)).astype(
-    X.dtype
-)
+resp = np.asarray(
+    random_state.uniform(size=(n_samples, self.n_components)), dtype=X.dtype
+)
```
lesteve left a comment
Thanks, I have enabled auto-merge!
```diff
 dtype = self.precisions_cholesky_.dtype
 if self.covariance_type == "full":
-    self.precisions_ = np.empty(self.precisions_cholesky_.shape)
+    self.precisions_ = np.empty_like(self.precisions_cholesky_)
```
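`np.empty_like` matches both the shape and the dtype of its argument, unlike `np.empty(shape)`, which defaults to float64, so `precisions_` inherits float32 automatically. For example:

```python
import numpy as np

chol = np.zeros((10, 5, 5), dtype=np.float32)

a = np.empty(chol.shape)     # dtype defaults to float64
b = np.empty_like(chol)      # same shape and dtype as chol

print(a.dtype, b.dtype)      # float64 float32
```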
This is a draft PR to explore how much work is needed to ensure floating-point precision propagation in `GaussianMixture`, as part of the discussion in #30382. There are still some failing tests to fix.