
Fix covariance initialization when matrix is not invertible #277

Merged
merged 17 commits into scikit-learn-contrib:master on Feb 4, 2020

Conversation


@grudloff grudloff commented Jan 24, 2020

This PR fixes #276, an issue that arises during covariance initialization for algorithms that don't require a PD matrix. The computed covariance matrix can be singular, and therefore not invertible, which leads to uninformative warnings. The fix is to use the pseudo-inverse whenever a PD matrix is not required.
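The failure mode can be reproduced in a few lines (illustrative data only, not code from this PR): a linearly dependent feature makes the covariance matrix singular, so a plain inverse is ill-defined while scipy's pinvh still returns a usable matrix.

```python
import numpy as np
from scipy.linalg import pinvh

# Illustrative only: a duplicated feature makes the covariance matrix singular
rng = np.random.RandomState(0)
X = rng.randn(20, 2)
X = np.hstack([X, X[:, :1]])    # third column duplicates the first
cov = np.cov(X, rowvar=False)   # 3 x 3 but rank 2: not invertible
M = pinvh(cov)                  # the pseudo-inverse is always well-defined
```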


bellet commented Jan 24, 2020

Thanks! Could you briefly describe the fix in the PR text, and confirm that it fixes the failing code snippet from the issue?

raise LinAlgError("Unable to get a true inverse of the covariance "
"matrix since it is not definite. Try another "
"`{}`, or an algorithm that does not "
"require the `{}` to be strictly positive definite."
Member

Maybe we could also add a third option for the user here: reducing the input dimensionality to avoid linear dependencies, as stated in the description of #276? Similarly to:

warnings.warn('The inner covariance matrix is not invertible, '
'so the transformation matrix may contain Nan values. '
'You should remove any linearly dependent features and/or '
'reduce the dimensionality of your input, '
'for instance using `sklearn.decomposition.PCA` as a '
'preprocessing step.')
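That advice can be illustrated as follows (hypothetical data; sklearn's PCA is only one possible way to do the reduction):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
X = np.hstack([X, X[:, :1]])    # duplicated feature: singular covariance
# keep only the linearly independent directions before metric learning
X_reduced = PCA(n_components=3).fit_transform(X)
```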

Contributor Author

Yes, I totally agree!

@wdevazelhes

For the test, maybe you can extend/take inspiration from

def test_singular_covariance_init_or_prior(estimator, build_dataset):
(see comment #276 (comment))

.format(*((matrix_name,) * 2)))
M = np.dot(u / s, u.T)
if strict_pd:
s, u = scipy.linalg.eigh(M_inv)
Member

You can simply import eigh instead of the whole scipy library?

Contributor Author

Yes, sure. I did it that way just because it was done like that previously.

if strict_pd:
s, u = scipy.linalg.eigh(M_inv)
cov_is_definite = _check_sdp_from_eigen(s)
if not cov_is_definite:
Member

Add a test case to exercise the issue you are trying to fix.

Contributor Author

Of course! I will build a test based on the suggestion of @wdevazelhes


grudloff commented Jan 24, 2020

Looking at _utils.py, I found that the same issue can arise from the custom matrix initialization init=M, as there is no check for invertibility. It can probably be handled with the warning @wdevazelhes suggested.


bellet commented Jan 24, 2020

Looking at _utils.py found that the same issue can arise from custom matrix initialization init=M as there is no check for invertibility. Probably can be handled with the warning @wdevazelhes suggested.

Good catch. We should return the pseudo-inverse there as well when return_inverse=True

@grudloff

Looking at _utils.py found that the same issue can arise from custom matrix initialization init=M as there is no check for invertibility. Probably can be handled with the warning @wdevazelhes suggested.

Good catch. We should return the pseudo-inverse there as well when return_inverse=True

Since we are not going to need the eigenvalues and eigenvectors to determine positive definiteness, would it be a good idea to replace this with the Cholesky factorization, which is a more efficient way to check it?


bellet commented Jan 27, 2020

Note that since we always compute M = np.dot(u / s, u.T) we could reuse u and s to compute the pseudo inverse directly instead of wasting this work by calling pinvh(M)

@grudloff

Note that since we always compute M = np.dot(u / s, u.T) we could reuse u and s to compute the pseudo inverse directly instead of wasting this work by calling pinvh(M)

You are right! I can compute the SVD and with that I can check the eigenvalues and also calculate the pseudo inverse. I will implement it that way then.


grudloff commented Jan 27, 2020

This last commit broke two tests, but I suspect they are not set up correctly, as they ask for a NonPSDError from MMC and MMC_Supervised, which shouldn't throw one as MMC doesn't ask for PSD.

Edit: For some reason, there seem to be many more errors than what I got on my local repo.
Edit2: It looks like an outdated version of numpy is being used; is there any way to confirm this?


bellet commented Jan 28, 2020

Note that since we always compute M = np.dot(u / s, u.T) we could reuse u and s to compute the pseudo inverse directly instead of wasting this work by calling pinvh(M)

You are right! I can compute the SVD and with that I can check the eigenvalues and also calculate the pseudo inverse. I will implement it that way then.

Why switch from eigh to svd? You should keep eigh (which will throw an error if the matrix is not symmetric) and the eigenvalue check, and simply replace the call to pinv by the computation of the pseudo-inverse from s and u


bellet commented Jan 28, 2020

This last commit broke two tests, but I suspect they are not set up correctly, as they ask for a NonPSDError from MMC and MMC_Supervised, which shouldn't throw one as MMC doesn't ask for PSD.

What do you mean? MMC does require PSD matrices (like any Mahalanobis metric learning algorithm). Maybe you meant it does not require strictly PD?


grudloff commented Jan 28, 2020

Note that since we always compute M = np.dot(u / s, u.T) we could reuse u and s to compute the pseudo-inverse directly instead of wasting this work by calling pinvh(M)

You are right! I can compute the SVD and with that, I can check the eigenvalues and also calculate the pseudo inverse. I will implement it that way then.

Why switch from eigh to svd? You should keep eigh (which will throw an error if the matrix is not symmetric) and the eigenvalue check, and simply replace the call to pinv by the computation of the pseudo-inverse from s and u

Is it possible to get the pseudo-inverse from the eigenvalue decomposition? I looked at the implementation of pinv, and it is done with SVD; since it also outputs the (square roots of the) eigenvalues, one can also check for positive definiteness.

@grudloff

This last commit broke two tests, but I suspect they are not set up correctly, as they ask for a NonPSDError from MMC and MMC_Supervised, which shouldn't throw one as MMC doesn't ask for PSD.

What do you mean? MMC does require PSD matrices (like any Mahalanobis metric learning algorithm). Maybe you meant it does not require strictly PD?

My bad! That is what I meant. But I was wrong, as the error is raised when there are negative eigenvalues. I suspect the problem is in the tolerance of _check_sdp_from_eigen, as I am using the singular values (the absolute values of the eigenvalues) from the SVD; maybe I should modify it. The failed tests are the following:

FAILED test/test_mahalanobis_mixin.py::test_init_mahalanobis[MMC] - Failed: DID NOT RAISE <class 'metric_learn.exceptions.NonPSDError'>
FAILED test/test_mahalanobis_mixin.py::test_init_mahalanobis[MMC_Supervised] - Failed: DID NOT RAISE <class 'metric_learn.exceptions.NonPSDError'>

Any idea about the other failed test? Seems like an outdated version of numpy is being used.


grudloff commented Jan 28, 2020

SVD loses the sign information of the eigenvalues, so PSD cannot be checked with them; this clearly explains the failed tests. I think I will have to roll back the last commit then. Sorry for the confusion!
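This point is easy to verify (a standalone illustration, not code from the PR): for a symmetric indefinite matrix the singular values are the absolute values of the eigenvalues, so the sign that determines PSD-ness is invisible to the SVD.

```python
import numpy as np

M = np.diag([1.0, -2.0])                 # symmetric but not PSD
s = np.linalg.svd(M, compute_uv=False)   # singular values: [2., 1.]
w = np.linalg.eigvalsh(M)                # eigenvalues: [-2., 1.]
# s is all-positive, so the negative eigenvalue cannot be detected from it
```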


bellet commented Jan 28, 2020

Is it possible to get the pseudoinverse from the eigenvalue decomposition? I looked at the implementation of pinv and it was done with svd, and since it also outputs the (square roots of the) eigenvalues one can also check for positive definiteness.

Yes since the SVD is the same as the eigenvalue decomposition for a symmetric matrix, see for instance
https://math.stackexchange.com/questions/86568/inverse-of-a-symmetric-positive-semi-definite-matrix
Switching back to eigh and computing the pseudo-inverse from its output should fix the failures you are getting

@grudloff

Is it possible to get the pseudoinverse from the eigenvalue decomposition? I looked at the implementation of pinv and it was done with svd, and since it also outputs the (square roots of the) eigenvalues one can also check for positive definiteness.

Yes, since the SVD is the same as the eigenvalue decomposition for a symmetric matrix. Switching back to eigh and computing the pseudo-inverse from its output should fix the failures you are getting

Yes, thank you! I was just looking at that.


bellet commented Jan 28, 2020

For computing the pseudo-inverse, you should simply follow what is done in pinvh (which actually calls eigh), including the tolerance strategy:
https://github.com/scipy/scipy/blob/v1.4.1/scipy/linalg/basic.py#L1394-L1470

if strict_pd and not init_is_definite:
raise LinAlgError("You should provide a strictly positive definite "
"matrix as `{}`. This one is not definite. Try another"
" {}, or an algorithm that does not "
"require the {} to be strictly positive definite."
.format(*((matrix_name,) * 3)))
elif return_inverse and not init_is_definite:
warnings.warn('The initialization matrix is not invertible, '
'but this isn\'t an issue as the pseudo-inverse is used.')
Member

instead of the second part maybe simply "using the pseudo-inverse instead"

Contributor Author

Thanks! I was looking for a better way to phrase that.

@bellet bellet left a comment

Things look good - you mainly need to write tests and we're good to go. You should definitely write at least one test for your _pseudo_inverse_from_eig function, and tests for the non-invertible cases.

Also it would be nice to check whether we have one for when the initialized matrix is not symmetric. According to https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html :
"If eigenvalue computation does not converge, an error occurred, or b matrix is not definite positive. Note that if input matrices are not symmetric or hermitian, no error is reported but results will be wrong."

So it is not clear what happens for us and whether we correctly deal with this

s, u = scipy.linalg.eigh(init)
init_is_definite = _check_sdp_from_eigen(s)
if isinstance(M, np.ndarray):
w, V = eigh(M, check_finite=False)
Member

why did you turn off check_finite? I think it is better to keep it

Contributor Author

check_array() does it previously, so it shouldn't be necessary to do it again.

Member

Indeed!

if strict_pd and not init_is_definite:
raise LinAlgError("You should provide a strictly positive definite "
"matrix as `{}`. This one is not definite. Try another"
" {}, or an algorithm that does not "
"require the {} to be strictly positive definite."
.format(*((matrix_name,) * 3)))
elif return_inverse and not init_is_definite:
warnings.warn('The initialization matrix is not invertible, '
'but this isn\'t an issue as the pseudo-inverse is used.')
Member

maybe simply "The initialization matrix is not invertible: using the pseudo-inverse instead"

M = np.dot(u / s, u.T)
elif not cov_is_definite:
warnings.warn('The inverse covariance matrix is not invertible, '
'but this isn\'t an issue as the pseudo-inverse is used. '
Member

same here.

elif not cov_is_definite:
warnings.warn('The inverse covariance matrix is not invertible, '
'but this isn\'t an issue as the pseudo-inverse is used. '
'You can remove any linearly dependent features and/or '
Member

for clarity, maybe start this sentence by "To make the inverse covariance matrix invertible, "

@grudloff

Things look good - you mainly need to write tests and we're good to go. You should definitely write at least one test for your _pseudo_inverse_from_eig function, and tests for the non-invertible cases.

Also it would be nice to check whether we have one for when the initialized matrix is not symmetric. According to https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html :
"If eigenvalue computation does not converge, an error occurred, or b matrix is not definite positive. Note that if input matrices are not symmetric or hermitian, no error is reported but results will be wrong."

So it is not clear what happens for us and whether we correctly deal with this

Great! I am making the tests now, should be done soon.

I don't think there should be an issue with non-symmetric matrices, as this is checked for the array init case and the covariance matrix is always symmetric. However, I think a test for the symmetry check of array init is missing and should be added.


bellet commented Jan 29, 2020

Indeed, I had missed the symmetry check for init, never mind! But if we are missing a test for this, it would indeed be nice to add it.


bellet commented Jan 29, 2020

@grudloff

(but it seems that we are not: https://codecov.io/gh/scikit-learn-contrib/metric-learn/src/master/metric_learn/_util.py#L671)

Great! With the other tests this should be ready then.

warnings.warn('The inverse covariance matrix is not invertible, '
'but this isn\'t an issue as the pseudo-inverse is used. '
'You can remove any linearly dependent features and/or '
warnings.warn('The initialization matrix is not invertible: '
Member

initialization matrix --> covariance matrix

'You can remove any linearly dependent features and/or '
warnings.warn('The initialization matrix is not invertible: '
'using the pseudo-inverse instead.'
'To make the inverse covariance matrix invertible'
Member

the inverse covariance --> the covariance

@bellet bellet left a comment

In addition to the above, it would be good to write a test for _pseudo_inverse_from_eig to verify that it returns the same result as calling pinvh on the original matrix.
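Such a test might look like this (a self-contained sketch, not the PR's actual test code; it rebuilds the pseudo-inverse from eigh's output inline and compares it to pinvh on a singular symmetric PSD matrix):

```python
import numpy as np
from scipy.linalg import eigh, pinvh

rng = np.random.RandomState(0)
L = rng.randn(4, 6)
M = L.T.dot(L)                  # 6 x 6, rank 4: singular

w, V = eigh(M)
tol = np.max(np.abs(w)) * max(M.shape) * np.finfo(w.dtype).eps
inv_w = np.divide(1.0, w, where=np.abs(w) > tol, out=np.zeros_like(w))
M_pinv = np.dot(V * inv_w, V.T)
```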

"""Tests that when using the 'covariance' init or prior, it returns the
appropriate warning if the covariance matrix is singular, for algorithms
that don't need a strictly PD init. Also checks that the returned
inverse matrix has finite values
Member

This is not what you check, right? You seem to check the learned M.
Any reason you do this and not a call to _initialize_metric_mahalanobis like in the other test? Maybe you could check both things in both tests?

@grudloff grudloff Feb 3, 2020

Yes, you are totally right! I forgot to set the maximum number of iterations so that it stops before the first iteration.

I called _initialize_metric_mahalanobis directly in the other test because currently there is no algorithm that could reproduce that setting. But setting max_iter = 0 should keep it from iterating, so M would be the matrix from the initialization.

Edit: It isn't possible to set max_iter = 0, so I will do it as you suggested instead.

Comment on lines +704 to +709
X, y = shuffle(*make_blobs(random_state=rng),
random_state=rng)
P = ortho_group.rvs(X.shape[1], random_state=rng)
w = np.abs(rng.randn(X.shape[1]))
w[0] = w0
M = P.dot(np.diag(w)).dot(P.T)
Member

this seems a bit convoluted just to generate a singular matrix? You could just draw a rectangular matrix L and construct M as L.T.dot(L)
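For reference, the suggested construction takes only a couple of lines (dimensions chosen arbitrarily here):

```python
import numpy as np

rng = np.random.RandomState(42)
L = rng.randn(3, 5)   # rectangular, so rank(L) <= 3
M = L.T.dot(L)        # 5 x 5 symmetric PSD of rank <= 3: singular
```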

Contributor Author

I agree, I just did it that way because it was done like that in test_singular_array_init_or_prior_strictpd. But this way there is only one zero eigenvalue, and you control its value directly through the parametrization.
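A minimal version of that controlled construction (a sketch, not the test's exact code): rotating a diagonal matrix whose first entry is pinned to zero yields a PSD matrix with exactly one zero eigenvalue whose value the test can parametrize.

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.RandomState(0)
P = ortho_group.rvs(4, random_state=rng)   # random orthogonal basis
w = np.abs(rng.randn(4))
w[0] = 0.0                                 # pin one eigenvalue (the `w0` knob)
M = P.dot(np.diag(w)).dot(P.T)             # PSD with a single zero eigenvalue
```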

Member

Sounds good

@bellet bellet left a comment

Apart from the random seed thing, LGTM!

def test_pseudo_inverse_from_eig_and_pinvh_singular(w0):
"""Checks that _pseudo_inverse_from_eig returns the same result as
scipy.linalg.pinvh for a singular matrix"""
A = np.random.rand(100, 100)
Member

You should fix the random seed

def test_pseudo_inverse_from_eig_and_pinvh_nonsingular():
"""Checks that _pseudo_inverse_from_eig returns the same result as
scipy.linalg.pinvh for a non singular matrix"""
A = np.random.rand(100, 100)
Member

Same here

@wdevazelhes wdevazelhes left a comment

LGTM, nice work @grudloff !


grudloff commented Feb 4, 2020

LGTM, nice work @grudloff !

Thanks!


bellet commented Feb 4, 2020

Nitpick: calling np.random.seed modifies the global random state. To avoid affecting other parts of the code, it is safer to create a separate pseudo-random number generator:

rng = np.random.RandomState(SEED)
A = rng.rand(100, 100)


bellet commented Feb 4, 2020

Thanks @grudloff ! Merging

@bellet bellet merged commit e739239 into scikit-learn-contrib:master Feb 4, 2020
Successfully merging this pull request may close these issues.

Error on covariance initialization when there are linearly dependent dimensions