Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation error in GaussianMixture #10141

Open
rdturnermtl opened this issue Nov 14, 2017 · 9 comments
Open

Documentation error in GaussianMixture #10141

rdturnermtl opened this issue Nov 14, 2017 · 9 comments

Comments

@rdturnermtl
Copy link

In the documentation of sklearn.mixture.GaussianMixture, it says that precisions_cholesky_ is:
"The cholesky decomposition of the precision matrices of each mixture component."
However in lines 317--321 of sklearn.mixture.gaussian_mixture.py it is clearly computing the inverse of the cholesky rather than the cholesky of the inverse. These are not the same:

import numpy as np

np.random.seed(0)
S = np.random.randn(5, 5)
S = np.dot(S, S.T)
IC = np.linalg.inv(np.linalg.cholesky(S).T)
CI = np.linalg.cholesky(np.linalg.inv(S)).T

print IC
print CI

gives

[[ 0.2801735   0.05324512 -0.24035224  0.15420992  0.49081071]
 [ 0.          0.70922139 -0.67751324  0.23295284 -6.89597681]
 [ 0.          0.          0.89988635 -0.67437444 -3.85334538]
 [ 0.          0.          0.          0.72994649  6.24915248]
 [ 0.          0.          0.          0.          3.76407547]]
[[ 0.63543472 -4.9542323  -3.48037026  5.00400095  2.90737741]
 [ 0.          4.90166957  1.74700377 -3.69934856 -2.35698345]
 [ 0.          0.          0.97357464 -0.71268093 -0.27514477]
 [ 0.          0.          0.          0.5929664   0.09843469]
 [ 0.          0.          0.          0.          0.27323201]]

I think the code is correct, but the documentation needs to be corrected. The same issue applies to BayesianGaussianMixture.

@amueller
Copy link
Member

My matrix algebra seem very rusty :-/

np.linalg.inv(S)
np.dot(IC, IC.T)
np.dot(CI.T, CI)

all produce the same result, which makes me think there's just some transpose somewhere?

@amueller
Copy link
Member

Ah, never mind. IC is not a cholesky because it's upper triangular times lower triangular. So yes, seems like an issue with the docs.

@jnothman
Copy link
Member

Please feel free to offer a PR with a change in the docs.

@FarahSaeed
Copy link
Contributor

Hi In _initialize method, it's computing cholesky of precision matrices in lines 644 - 647 of gaussian_mixture.py, (which I think is cholesky of inverse). Isn't it different from the implementation in _compute_precision_cholesky which takes inverse of cholesky?

@lesteve
Copy link
Member

lesteve commented Nov 16, 2017

@FarahSaeed friendly advice: for clarity's sake link to code like this.

In your particular example, I guess you mean

elif self.covariance_type == 'full':
self.precisions_cholesky_ = np.array(
[linalg.cholesky(prec_init, lower=True)
for prec_init in self.precisions_init])
?

@FarahSaeed
Copy link
Contributor

@lesteve yeah exactly.

@rdturnermtl
Copy link
Author

Yeah that definitely looks like an inconsistency in the code. Maybe it would make more sense to take the Cholesky than the precision for the initialization.

Is there a good way to fix this without creating backwards compatibility issues?

@aby0
Copy link
Contributor

aby0 commented Dec 5, 2017

@jnothman As far as my understanding, we generally use cholesky decomposition for calculating inverse of positive definite matrix efficiently(better than LU decomposition), so taking cholesky of covariance makes sense in order to calculate precision matrix eventually. So, It might be just an documentation error. If so, can I move ahead to make changes in documentation? Thanks

@jnothman
Copy link
Member

jnothman commented Dec 6, 2017

This isn't my expertise, @aby0, but submit a PR and the change will certainly be considered!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants