Avoiding zeros in NMF β loss calculation causes inconsistency #25438
Comments
Searching the file for additional occurrences, I found a similar pattern in 9 other places, mostly in the β loss calculation, but also more generally to avoid division by zero. |
Thanks for the report. Changing to Here, |
I can submit a pull request; are there any specifics I need to know about the process? Regarding the size of ε: in NMF, WH is the product of two matrices W and H, so values in the ε range in either factor may matter, and their product will then be in the ε^2 range. I don't think it's critical, though. |
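To make the ε^2 point concrete, here is a minimal sketch. It assumes EPSILON is float32 machine epsilon (my understanding of how the module defines it; treat that as an assumption), and the tiny matrices W and H are made up for illustration.

```python
import numpy as np

# Assumption: EPSILON is float32 machine epsilon (~1.19e-07).
EPSILON = float(np.finfo(np.float32).eps)

# Hypothetical 1x2 and 2x1 factors whose only non-zero product term is EPSILON * EPSILON.
W = np.array([[EPSILON, 1.0]], dtype=np.float64)
H = np.array([[EPSILON], [0.0]], dtype=np.float64)

WH = W @ H  # single entry, equal to EPSILON**2 (~1.4e-14)

# Clipping everything below EPSILON lifts this entry up to EPSILON,
# so information at the EPSILON**2 scale is lost.
clipped = np.where(WH < EPSILON, EPSILON, WH)

print(WH[0, 0], clipped[0, 0])
```

The product is strictly positive but far below EPSILON, so a `< EPSILON` clip treats it the same as an exact zero.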
Shouldn't |
Great question @glevv, though I'm not qualified to address it. I can say that, strictly speaking, if the support of one distribution doesn't match that of the other (i.e. one is zero where the other is non-zero), their KL divergence should be infinite. |
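A small sketch of the support-mismatch point, using a hand-written discrete KL divergence rather than any library routine. The distributions `p` and `q` and the helper name `kl_divergence` are made up for illustration, and the clipped variant deliberately does not renormalize `q`.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p == 0 contribute 0."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    mask = p > 0
    # Division by zero in q yields inf, which is the intended result here.
    with np.errstate(divide="ignore"):
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]  # q is zero where p is not

kl_exact = kl_divergence(p, q)                            # infinite
kl_eps_large = kl_divergence(p, np.maximum(q, 1e-3))      # finite
kl_eps_small = kl_divergence(p, np.maximum(q, 1e-15))     # finite, but larger

print(kl_exact, kl_eps_large, kl_eps_small)
```

Replacing the zero in `q` with an epsilon turns the infinite divergence into a finite one whose size depends on the epsilon chosen.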
It's just a matter of reweighting the probabilities
A higher epsilon (like 1e-3) gives a large discrepancy between the proposed and the straightforward method, while the reweighted one stays close (to the 3rd decimal place). With a lower epsilon (like 1e-15) all the methods "converge" to the same value. If we reverse the arguments (kl_div(probs_y, probs)) we get
And the second method gives a higher number (but not infinity) than the third for any epsilon, which is what is needed. |
All right, so how do you suggest I change the pull request (one or a combination of the following)? |
I think it's up to the core team to decide how to move forward with this issue |
Should this one be closed? |
scikit-learn/sklearn/decomposition/_nmf.py
Line 143 in 3f82f84
The line
WH_data[WH_data == 0] = EPSILON
in the β loss calculation creates an ordering inconsistency: an entry that was larger than another (say 1e-09, which is larger than zero) ends up smaller than it once the zero is replaced by EPSILON. I think the code should be changed to
WH_data[WH_data < EPSILON] = EPSILON
to avoid such inconsistencies.
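A minimal sketch of the inconsistency, outside of scikit-learn. It assumes EPSILON is float32 machine epsilon (an assumption about the module's constant); the helper names and the sample array are hypothetical.

```python
import numpy as np

# Assumption: EPSILON is float32 machine epsilon (~1.19e-07).
EPSILON = float(np.finfo(np.float32).eps)

def clip_zeros_only(wh):
    """Current behaviour: only exact zeros are replaced."""
    out = wh.copy()
    out[out == 0] = EPSILON
    return out

def clip_below_epsilon(wh):
    """Proposed behaviour: everything below EPSILON is replaced."""
    out = wh.copy()
    out[out < EPSILON] = EPSILON
    return out

wh = np.array([0.0, 1e-9, 1e-3])  # sorted in increasing order

current = clip_zeros_only(wh)      # [EPSILON, 1e-9, 1e-3]: order is broken
proposed = clip_below_epsilon(wh)  # [EPSILON, EPSILON, 1e-3]: order is kept

print(current, proposed)
```

With the `== 0` mask the former zero overtakes the 1e-9 entry, while the `< EPSILON` mask keeps the values non-decreasing.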
Another issue is preventing overflows in the division that happens a few lines later: if the data contains the smallest positive number (around 10^-38), dividing a number larger than 1 by it can cause an overflow.
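The overflow can be reproduced with plain NumPy in float32. This is only a sketch: the numerator 10 is an arbitrary choice, and the lower clip at EPSILON (assumed to be float32 machine epsilon) is the fix proposed above, not necessarily what the library does.

```python
import numpy as np

# Assumption: EPSILON is float32 machine epsilon (~1.19e-07).
EPSILON = np.finfo(np.float32).eps
tiny = np.finfo(np.float32).tiny  # smallest positive normal float32, ~1.18e-38

x = np.array([10.0], dtype=np.float32)
wh = np.array([tiny], dtype=np.float32)

# 10 / 1.18e-38 exceeds the float32 maximum (~3.4e38), so the result is inf.
with np.errstate(over="ignore"):
    ratio = x / wh

# Clipping the denominator from below keeps the quotient finite.
safe_ratio = x / np.maximum(wh, EPSILON)

print(ratio, safe_ratio)
```

The `== 0` mask leaves `tiny` untouched because it is not exactly zero, which is why the division can still overflow.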