BUG: Error computing multivariate_normal.logpdf with singular cov matrix #15509
Comments
@IgnacioHeredia Here is the relevant section of code: scipy/scipy/stats/_multivariate.py Lines 464 to 466 in 5f4ba67
In your example, only the last of these is different numerically, and it just comes down to the fact that I don't have time to look into which is correct right now. Do you know?
Here is a simplified version of your example, modified to agree with SciPy:
import numpy as np
from scipy.stats import multivariate_normal

sample = np.array([1, 1])
mean = np.array([0, 0])
cov = np.array([[2, 6],
                [6, 18]])

def manual_logpdf(sample):
    # ref: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Likelihood_function
    eig_values = np.linalg.eig(cov)[0]
    # keep only the nonzero eigenvalues (pseudo-determinant)
    eig_values = eig_values[eig_values > 1e-12]
    log_det = np.log(np.prod(eig_values))
    inv = np.linalg.pinv(cov)
    residuals = sample - mean
    k = np.linalg.matrix_rank(cov)  # len(mean)?
    return -0.5 * (k * np.log(2 * np.pi)
                   + log_det
                   + residuals.T.dot(inv).dot(residuals))

x = multivariate_normal.logpdf(x=sample, mean=mean, cov=cov,
                               allow_singular=True)
x2 = manual_logpdf(sample)
print(x, x2)  # -2.4568046699816684 -2.4568046699816684
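The `# len(mean)?` comment in the snippet above points at the one term in question: whether `log(2 * pi)` is multiplied by the rank of `cov` or by the length of the vector. The following is a minimal NumPy-only sketch of that comparison, not SciPy's actual code; the helper name `singular_logpdf` and the `tol` cutoff are assumptions made for illustration.

```python
import numpy as np

mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 6.0],
                [6.0, 18.0]])  # rank 1: the second row is 3x the first
sample = np.array([1.0, 1.0])


def singular_logpdf(x, mean, cov, k, tol=1e-12):
    """Log-density built from the pseudo-inverse and pseudo-determinant.

    k is the multiplier applied to log(2*pi); tol is an illustrative
    cutoff below which eigenvalues are treated as zero.
    """
    eigvals = np.linalg.eigvalsh(cov)  # cov is symmetric
    log_pdet = np.sum(np.log(eigvals[eigvals > tol]))
    r = x - mean
    maha = r @ np.linalg.pinv(cov) @ r
    return -0.5 * (k * np.log(2 * np.pi) + log_pdet + maha)


by_rank = singular_logpdf(sample, mean, cov, k=np.linalg.matrix_rank(cov))
by_len = singular_logpdf(sample, mean, cov, k=len(mean))

# The two conventions differ by exactly 0.5 * log(2*pi) * (len(mean) - rank).
print(by_rank, by_len, by_rank - by_len)
```

With `k` set to the rank, the sketch gives the value printed in the snippet above; with `k = len(mean)`, the result is lower by `0.5 * log(2 * pi)` for this rank-1, two-dimensional example.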
Hi @mdhaber, thanks for taking a look! The Wikipedia page from which I took the logpdf formula says that
I think you are actually right @mdhaber. I went to one of the sources listed in the Wikipedia page and I found this (p. 528):
The Wikipedia page notation is indeed misleading. The book clearly distinguishes the length I guess SciPy's implementation is correct after all! I'll try to submit fixes to both Wikipedia and the Thanks for all!
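For reference, the degenerate-case log-density that `manual_logpdf` in this thread computes can be written out as below. The symbols are notation assumed here rather than quoted from the book: r is the rank of Σ, Σ⁺ its Moore-Penrose pseudo-inverse, and det* the pseudo-determinant (the product of the nonzero eigenvalues). The expression is meant for points x in the support of the distribution.

```latex
\log f(x) = -\tfrac{1}{2}\left[\, r \log(2\pi)
            + \log {\det}^{*}(\Sigma)
            + (x-\mu)^{\mathsf{T}}\, \Sigma^{+} (x-\mu) \right],
\qquad r = \operatorname{rank}(\Sigma)

% Worked for the example in this thread: r = 1, det*(Sigma) = 20,
% (x-\mu)^T \Sigma^{+} (x-\mu) = 32/400 = 0.08
\log f(x) = -\tfrac{1}{2}\left[\, \log(2\pi) + \log 20 + 0.08 \,\right] \approx -2.4568
```

Replacing r with the vector length (2 in the example) would subtract a further ½ log(2π) from this value.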
I think SciPy is right because then the
Thinking about it from a different perspective, I might also have expected the pdf to be zero for a vector not aligned with
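The comment above is cut off, but assuming it refers to the range (column space) of `cov`, the alignment it describes could be checked with a sketch like the following. The helper `in_support` and its tolerance are hypothetical, introduced only for illustration; this is not a description of what SciPy does.

```python
import numpy as np

mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 6.0],
                [6.0, 18.0]])
sample = np.array([1.0, 1.0])


def in_support(x, mean, cov, tol=1e-9):
    """Return True if x - mean lies in the column space of cov."""
    r = x - mean
    # cov @ pinv(cov) is the orthogonal projector onto the column space of cov.
    projector = cov @ np.linalg.pinv(cov)
    off_support = r - projector @ r
    return bool(np.linalg.norm(off_support) <= tol * max(1.0, np.linalg.norm(r)))


print(in_support(sample, mean, cov))                # False: [1, 1] is not a multiple of [1, 3]
print(in_support(np.array([1.0, 3.0]), mean, cov))  # True: [1, 3] spans the column space
```

Under this reading, the sample `[1, 1]` used in the thread has a component outside the column space of `cov`, which is the situation where a zero density might be expected.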
Describe your issue.
Hi everyone,
I noticed a possible error when implementing the logpdf of a multivariate normal with singular covariance matrices. The SciPy implementation gives different results from a manual implementation (formula here). I attach a minimal reproducible example.
The output is:
At this point my usual approach would be to not trust my code, but it happens that using R I get the same results as my manual implementation in Python, so I'm a bit clueless about what multivariate_normal.logpdf is actually doing in the case of singular matrices. The R code uses dmvnorm from the mvnorm package.
I also opened a (possibly related?) issue regarding multivariate_normal at #15508.
Thanks!
Ignacio
Reproducing Code Example
Error message
SciPy/NumPy/Python version information
1.6.2