FIX better handle limit cases in normalized_mutual_info_score #22635
Conversation
Thanks for the PR!
With the explicit handling of constant clusters, I think we can remove:
sklearn/metrics/cluster/_supervised.py, lines 1038 to 1039 (at 9ced5ec):

```python
# Avoid 0.0 / 0.0 when either entropy is zero.
normalizer = max(normalizer, np.finfo("float64").eps)
```
I agree. Now the only way to have an entropy < eps would be to have a single one in an array of approximately 10^17 zeros, which is an unrealistically big array :)
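To make the order-of-magnitude argument concrete, here is a small sketch (using an illustrative `entropy_of_counts` helper, not the actual sklearn internals): a labeling with a single odd point among n points has entropy roughly (log(n) + 1) / n, which only drops below float64 eps for n around 10^17.

```python
import numpy as np

def entropy_of_counts(counts):
    # Shannon entropy (natural log) of a labeling, given its cluster sizes.
    # Illustrative helper mirroring sklearn's entropy; an assumption of this sketch.
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

eps = np.finfo("float64").eps  # ~2.22e-16

# One odd point among n: entropy ~ (log(n) + 1) / n, far above eps
# for any array that fits in memory.
for n in [10, 10**6, 10**12]:
    h = entropy_of_counts([1, n - 1])
    print(f"n={n}: entropy={h:.3e}, below eps? {h < eps}")
```

So with constant clusterings handled explicitly, the eps clamp on the normalizer never fires in practice.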
LGTM
Thanks @jeremiedbb
Fixes #13836
by better handling of limit cases, i.e. when one or both labelings are constant, meaning there is a single cluster.
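The limit-case handling can be sketched as follows. This is an illustrative re-implementation (`nmi_sketch`, `mutual_info`, and `entropy` are hypothetical helpers, not the actual sklearn code): if both labelings are constant they trivially match, so return 1.0; otherwise, if the mutual information is 0 the NMI must be 0 regardless of normalization, so no eps clamp on the normalizer is needed.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    # Shannon entropy (natural log) of a labeling.
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_info(labels_true, labels_pred):
    # Mutual information from the joint and marginal label counts.
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    pt, pp = Counter(labels_true), Counter(labels_pred)
    mi = sum(
        (c / n) * np.log(c * n / (pt[a] * pp[b]))
        for (a, b), c in joint.items()
    )
    return max(mi, 0.0)  # guard tiny negative round-off

def nmi_sketch(labels_true, labels_pred, average_method=np.mean):
    classes, clusters = set(labels_true), set(labels_pred)
    # Both labelings constant: a single cluster on each side, perfect match.
    if len(classes) == len(clusters) == 1:
        return 1.0
    mi = mutual_info(labels_true, labels_pred)
    # Past this point mi == 0 cannot be a perfect match, so NMI is 0
    # whatever the normalizer; no eps clamp required.
    if mi == 0.0:
        return 0.0
    h_true, h_pred = entropy(labels_true), entropy(labels_pred)
    return mi / average_method([h_true, h_pred])
```

For example, a constant labeling against a non-constant one yields an NMI of 0.0 instead of an ill-defined 0/0.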