RuntimeError: cholesky_cuda: For batch 0: U(6,6) is zero, singular U. #31248
Comments
Sorry, I guess Cholesky is used here to circumvent computing the inverse covariance matrix in the quadratic form... But your covariance matrix is diagonal, so maybe it is possible to exploit this structure before constructing the probability distribution object. @LiUzHiAn, do you know whether it is possible to do so, I mean to provide just the diagonal when constructing the distribution?
Hi, @nikitaved. In my case, I assume the two distributions are batched multivariate normals, each with a diagonal covariance matrix. Then, with the batch dimension, I can compute the KL term for all samples at once. The reason I use `MultivariateNormal` is that I was not sure the diagonal alternative supports the batch dimension.
Ok, I did not notice the batch dimension. Anyway, your covariance matrix clearly becomes singular, while the Cholesky decomposition assumes the matrix is positive definite, and hence full rank. That is not the case here. Moreover, your covariance matrix has a simple diagonal structure, so there is no need for any decomposition: you could compute the (pseudo)inverse right away! So I do not think this is a bug. Let me know whether my suggestion works, and if so, we can close the issue.
I've written some test code and it seems that the batch dimension is supported. Ok, let's leave this issue here. Thank you so much |
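The thread does not include the actual test code. A sketch of such a test follows; using `Independent` over `Normal` is my assumption here. It represents a diagonal-covariance Gaussian, and its KL divergence is computed in closed form, with no Cholesky factorization:

```python
import torch
from torch.distributions import Independent, Normal, kl_divergence

# Hypothetical batch of 8 distributions over a 6-dimensional event.
mu_p, mu_q = torch.randn(8, 6), torch.randn(8, 6)
# Diagonal standard deviations; no full covariance matrix is ever built,
# so no Cholesky factorization is needed.
std_p = torch.rand(8, 6) + 0.1
std_q = torch.rand(8, 6) + 0.1

# Independent reinterprets the last batch dimension of Normal as an event
# dimension, i.e. a multivariate normal with diagonal covariance.
p = Independent(Normal(mu_p, std_p), 1)
q = Independent(Normal(mu_q, std_q), 1)

kl = kl_divergence(p, q)
print(kl.shape)  # one KL value per batch element: torch.Size([8])
```

`kl_divergence` for a pair of `Independent` distributions sums the per-dimension KL over the event dimension, so the batch dimension passes through untouched.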
No problem, just let me know whether it works for you.
Also @nikitaved, "In practice it may be necessary to add a small multiple of the identity matrix εI to the covariance matrix for numerical reasons. This is because the eigenvalues of the matrix K0 can decay very rapidly and without this stabilization the Cholesky decomposition fails. The effect on the generated samples is to add additional independent noise of variance ε. From the context ε can usually be chosen to have inconsequential effects on the samples, while ensuring numerical stability." (A.2 Gaussian Identities). Blog from which I found this: https://juanitorduz.github.io/multivariate_normal/
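That stabilization ("jitter") can be sketched as follows; the helper name and the `eps` default are mine, not from the thread:

```python
import torch

def stabilized_cholesky(cov: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Add a small multiple of the identity (jitter) before factorizing.

    eps is a hypothetical default; in practice it is chosen large enough
    for stability but small enough not to distort the samples.
    """
    n = cov.shape[-1]
    jitter = eps * torch.eye(n, dtype=cov.dtype, device=cov.device)
    return torch.linalg.cholesky(cov + jitter)

# A rank-1 (singular) covariance: plain Cholesky fails on it.
v = torch.tensor([[1.0, 2.0, 3.0]], dtype=torch.float64)
singular_cov = v.T @ v  # positive semi-definite, rank 1
L = stabilized_cholesky(singular_cov)
print(L.shape)  # torch.Size([3, 3])
```

With the jitter added, the smallest eigenvalue is at least `eps`, so the factorization succeeds while the samples gain only O(√eps) extra noise per dimension.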
🐛 Bug when using `torch.distributions.kl_divergence(p, q)`
Hi, I always get this RuntimeError during my training process: `RuntimeError: cholesky_cuda: For batch 0: U(6,6) is zero, singular U.`
Reproduce
There is a KL-divergence term in my loss function, and I assume the two distributions are multivariate normals, so I computed it with `torch.distributions.kl_divergence`.
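The original snippet was not captured, so here is a minimal sketch of the kind of code that triggers the error. The shapes and variable names are assumptions; the key point is a predicted variance collapsing to zero, which makes the covariance matrix singular:

```python
import torch
from torch.distributions import MultivariateNormal, kl_divergence

# Hypothetical stand-ins for the network outputs: a batch of 4
# distributions over a 6-dimensional event.
mu_p, mu_q = torch.randn(4, 6), torch.randn(4, 6)
var_p = torch.full((4, 6), 1e-2)
var_q = torch.full((4, 6), 1e-2)
var_p[0, 5] = 0.0  # one predicted variance collapses to zero

try:
    # MultivariateNormal factorizes the covariance with Cholesky, which
    # requires strict positive definiteness; a zero on the diagonal breaks it.
    p = MultivariateNormal(mu_p, covariance_matrix=torch.diag_embed(var_p))
    q = MultivariateNormal(mu_q, covariance_matrix=torch.diag_embed(var_q))
    loss = kl_divergence(p, q).mean()
except (RuntimeError, ValueError) as exc:
    # Depending on the PyTorch version, this surfaces as a RuntimeError from
    # the Cholesky kernel or a ValueError from argument validation.
    print(type(exc).__name__)
```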
Environment
Here are my package versions:
Additional
I find my KL-loss falls in the range from 1e-6 to 1e-5.
Any ideas on how to solve this problem? Thanks!