GMM fails without LAPACK #4210

Open
vigsterkr opened this issue Mar 19, 2018 · 14 comments

Comments

@vigsterkr
Member

It seems that without a LAPACK backend for Eigen, some of our tests are broken (see #4204), one of which is GMM.

@vinx13
Member

vinx13 commented May 28, 2018

I have checked locally: LinalgBackendEigen::eigen_solver_symmetric_impl returns negated eigenvectors when the LAPACK backend is used.

@vigsterkr
Member Author

@vinx13 great, thanks for checking this! So yeah, our unit test for GMM is not invariant to this case, which it should be, since an eigenvector and its negation are actually the same eigenvector up to sign.
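
To make the sign ambiguity concrete, here is a minimal sketch (plain C++, standard library only, not Shogun code) that measures the residual ||A v - lambda v||. The names `eigen_residual` and `negated` are hypothetical helpers introduced for illustration. For a valid eigenpair the residual is numerically zero for both v and -v, which is why two backends may legitimately return opposite signs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>; // row-major, assumed square

// Euclidean norm of A*v - lambda*v.
// It is (numerically) zero exactly when (lambda, v) is an eigenpair of A.
double eigen_residual(const Matrix& A, const Vector& v, double lambda)
{
    assert(A.size() == v.size());
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < A.size(); ++i)
    {
        assert(A[i].size() == v.size());
        double r = -lambda * v[i];
        for (std::size_t j = 0; j < v.size(); ++j)
            r += A[i][j] * v[j];
        sum_sq += r * r;
    }
    return std::sqrt(sum_sq);
}

// Returns a copy of v with every entry negated.
Vector negated(Vector v)
{
    for (double& x : v)
        x = -x;
    return v;
}
```

A test that checks the residual (or compares reconstructed matrices) instead of the raw eigenvector entries would not depend on which sign the backend happens to pick.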

@karlnapf
Member

in #4204 the integration tests fail (i.e. the mean of the mixture component), not the unit test.

I am not sure whether the cluster centres might be permuted if the sign changes, or if the results are simply different. Did anyone check whether some unit tests fail as well?
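
One way to tell "permuted" apart from "simply different" is to compare the component means up to reordering. The following is only a sketch in plain C++ (the helper name `means_match_up_to_permutation` is hypothetical and not part of Shogun): it sorts both sets of means lexicographically and then compares them coordinate by coordinate with an absolute tolerance.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// True if the two sets of component means agree up to reordering,
// within an absolute tolerance per coordinate.
// Caveat: lexicographic sorting can pair means incorrectly when two means
// have leading coordinates closer to each other than the tolerance.
bool means_match_up_to_permutation(
    std::vector<Vector> a, std::vector<Vector> b, double tol)
{
    if (a.size() != b.size())
        return false;
    std::sort(a.begin(), a.end()); // std::vector compares lexicographically
    std::sort(b.begin(), b.end());
    for (std::size_t k = 0; k < a.size(); ++k)
    {
        if (a[k].size() != b[k].size())
            return false;
        for (std::size_t j = 0; j < a[k].size(); ++j)
            if (std::fabs(a[k][j] - b[k][j]) > tol)
                return false;
    }
    return true;
}
```

If this check passes on the two serialized results, the components were only reordered; if it fails, the fitted parameters really differ.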

@vigsterkr
Member Author

@karlnapf unit test? Oh yeah, I mistyped, I meant the integration test. The whole solution is messed up there: with different EVs, that integration test ends up with all different params.

@vigsterkr
Member Author

different EVs = opposite direction

@karlnapf
Member

I'll dig into the GMM code and see whether we can make it invariant to the sign in GMM itself.

@karlnapf
Member

Maybe it would make sense to do some postprocessing after the backend call.
We could, for example, enforce that all eigenvalues have positive sign (moving the minus into the eigenvector, or just dropping it if both are negative). Thoughts?
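
As an illustration of what such postprocessing could look like, here is a minimal sketch in plain C++. It uses a different but common convention than the one described above, and it is an assumption rather than anything Shogun implements: each eigenvector is flipped so that its largest-magnitude entry is positive. The function name `canonicalize_sign` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// Flips v in place so that its largest-magnitude entry is positive.
// Ties are broken by taking the first such entry; an empty or all-zero
// vector is left unchanged.
// Caveat: if two entries have nearly equal magnitude, rounding noise can
// change which one is picked, so the convention is not fully robust.
void canonicalize_sign(Vector& v)
{
    std::size_t pivot = 0;
    for (std::size_t i = 1; i < v.size(); ++i)
        if (std::fabs(v[i]) > std::fabs(v[pivot]))
            pivot = i;
    if (!v.empty() && v[pivot] < 0.0)
        for (double& x : v)
            x = -x;
}
```

Applied to each eigenvector after the backend call, this maps v and -v to the same representative, so downstream code sees the same input regardless of which backend produced it.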

@vigsterkr
Member Author

vigsterkr commented May 30, 2018

---- i.e. don't like

@vigsterkr
Member Author

FYI, here's the generated output when LAPACK isn't available:

<<_SHOGUN_SERIALIZABLE_ASCII_FILE_V_00_>>
array Vector<SGSerializable*> 7 ({Serializable int32 [
value int32 3
]}{VectorSerializable float64 [
value SGVector<float64> 2 ({-2.135073443612682}{3.141044590471833})
]}{Serializable int32 [
value int32 1
]}{VectorSerializable float64 [
value SGVector<float64> 2 ({0.7827734860786125}{-1.76419104479616})
]}{MatrixSerializable float64 [
value SGMatrix<float64> 2 2 ({3.392643915170048}{0.6720138121871488}{0.6720138121871488}{1.244541368311431})
]}{VectorSerializable float64 [
value SGVector<float64> 3 ({0.1969275849684337}{0.533330685448861}{0.2697417295827056})
]}{VectorSerializable float64 [
value SGVector<float64> 4 ({-16.95532013642032}{-3.130118784821983}{-31.06361716282668}{-3.130117794465724})
]})
resize_granularity int32 128
use_sg_malloc bool t
free_array bool t
dim1_size int32 1
dim2_size int32 1
dim3_size int32 1

@karlnapf
Member

The alternative is to do that in the GMM ... and in all the other algorithms where such problems appear.

@vigsterkr
Member Author

@karlnapf even more so ----

@karlnapf
Member

in this case ... good luck solving it :)

@vigsterkr
Member Author

in this case imo this shows how some of our integration tests are actually not the best. :)

@vigsterkr
Member Author

same story comes up with the randoms
