-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: speed up multivariate_normal.logpdf for nonsingular matrices #9973
Conversation
The typical response to such determinant and rank computations is that Cholesky factorization is not a rank-revealing factorization hence cannot be used reliably for near-singular matrices. Often cholesky is a pre-step for positive definiteness "if it passes nice but otherwise do the expensive way". Because the expensive way is way more expensive than computing the Cholesky factorization it doesn't harm too much but if exists saves quite a bit. Hence instead of "if, else" block we need a "try, except" (I mean conceptually, it doesn't have to live in try, except necessarily). |
Apologies for not going back to this work lately. @ilayn can you suggest a good way to check whether a matrix is not near-singular given its Cholesky decomposition? I have found some relevant information in this presentation starting from slide 22, but I'm not sure if I want to go all the way into implementing the algorithm they suggest. In other words, what should trigger the |
@ilayn could you elaborate on how to detect near-singular matrices using Cholesky decomposition? |
Ouch, my apologies for the ridiculous delay. What I meant (I hope) is that Cholesky decomposition test is used as a canary for positive definiteness, if it succeeds then nice we carry on. If it fails either the matrix is not positive definite (which is on paper not possible since it is covariance) or precision-related failures. If you do this test inside a try except block you can catch it and fall back to eigh or something else. |
just a side comment
I have several cases in statsmodels where I work with singular, reduced rank covariance matrices, that are not directly estimated from data. |
I'm not sure that it @josef-pkt also rightfully notes that the covariance matrix might be singular in practice (e.g. a 1D Gaussian on a 2D plane). So is there a good way to detect if the matrix is non-singular given the output of |
Yes, so if that is such a problem then you can fallback to whichever branch you want. We switch to a more expensive method like eigh or svd and bite the bullet or otherwise Cholesky should go through without any performance penalty. But shouldn't get stuck at Cholesky step. You can check the |
Hi @dniku just thought I'd check in to see if you planned to finish this up? What do you need from us? |
This had been inactive for a while, and it's out of date, so I think it's safe to close. This is a good idea, though, add we'll take care of this in a follow-up to gh-16002. See mdhaber#88 for how this might work now. |
NOTE: this is a draft PR. There is a potential critical issue with precision, no tests, no documentation, and no benchmarks.
This PR changes the behavior of the following functions when
allow_singular=False
:multivariate_normal.logpdf
multivariate_normal.pdf
multivariate_normal_frozen.logpdf
multivariate_normal_frozen.pdf
With these changes, I observed a 35x speedup on average. A Jupyter notebook with benchmarks can be found here (executed on Dell XPS 15 9560). The central idea here is to replace
eigh
with the Cholesky decomposition. The code (as submitted in this PR) is equivalent to the following:It is very likely that the current implementation has an issue with precision. In the notebook, it can be seen that at
D = 385
I generated a vectormean
and a matrixcov
such that the result computed by my implementation was significantly different from the one computed by the current version in Scipy (detected bynp.allclose
). A dump of the corresponding arrays can be found here (withm
formean
andS
forcov
); I have not investigated yet what causes this issue.TODO:
I am submitting this PR in raw state to get feedback on whether this approach is worth pursuing at all, or it is something that has known problems and thus cannot be included in Scipy.
Relevant: