Windows only: problem with mutual_info_classif when discrete=True #9772
RuntimeWarning: invalid value encountered in log
Steps/Code to Reproduce
I'm trying to compute the mutual information between two discrete variables: one takes values from 1 to 5, the other takes the values 0 and 1.
I tried again but I still got this:
@lesteve I am experiencing this error on an Ubuntu system. Also, after fixing a probable integer overflow, I am getting
The problem lies at line 605 of sklearn/metrics/cluster/supervised.py:
    outer = pi.take(nzx) * pj.take(nzy)
    if np.any(outer < 0):
        outer = pi.take(nzx).astype(np.int64) * pj.take(nzy).astype(np.int64)
and thus got
I am guessing this is because you are using a 32-bit Python. I can reproduce the problem using a 32-bit Python.
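For readers following along, the overflow can be sketched with NumPy alone. The counts below are made up for illustration, and the dtype is forced to int32 to mimic a platform where the default integer is 32 bits wide; this is not the actual sklearn code, only a small reproduction of the arithmetic.

```python
import numpy as np

# Hypothetical marginal counts, stored as 32-bit integers (assumption:
# this mimics the default integer dtype on the affected platforms).
pi = np.array([60000, 70000], dtype=np.int32)
pj = np.array([50000, 40000], dtype=np.int32)

# The int32 product exceeds 2**31 - 1 and silently wraps to negative values.
outer_32 = pi * pj

# Taking the log of a negative number yields nan; without errstate this
# emits "RuntimeWarning: invalid value encountered in log".
with np.errstate(invalid="ignore"):
    log_32 = np.log(outer_32)

# Casting to int64 before multiplying keeps the product exact.
outer_64 = pi.astype(np.int64) * pj.astype(np.int64)
log_64 = np.log(outer_64)

print(outer_32, log_32)
print(outer_64, log_64)
```

The cast is applied to both operands before the multiplication; casting the already wrapped product afterwards would not recover the correct values.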
This is what I get as well. Not sure why I had a different value in my previous post.
Seems like you figured out where the int overflow happens, thanks! Not sure what the best fix actually is, maybe casting
As for testing, you are more than welcome to add a test similar to the one in the first post (without the pandas dependency). This should fail on Windows without your fix.
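A pandas-free test along those lines might look like the sketch below. The function name `check_mutual_info_no_overflow` and the sample size are hypothetical choices, not taken from the thread; the data mirrors the first post (one variable in 1..5, the other in 0..1) and is sized so that the product of marginal counts exceeds the int32 range. It uses `sklearn.metrics.mutual_info_score` as the entry point, on the assumption that it exercises the code path quoted above.

```python
import numpy as np
from sklearn.metrics import mutual_info_score


def check_mutual_info_no_overflow(n_samples=200_000):
    # x cycles through 1..5, so each value occurs n_samples / 5 times.
    # np.arange uses the platform default integer dtype.
    x = np.arange(n_samples) % 5 + 1
    # y is 0 or 1 and fully determined by x, so MI(x, y) equals H(y).
    y = x % 2
    mi = mutual_info_score(x, y)
    # Reference value: entropy of y in nats, computed in floating point.
    p = np.bincount(y) / n_samples
    expected = -np.sum(p * np.log(p))
    return mi, expected


mi, expected = check_mutual_info_no_overflow()
print(mi, expected)
```

With 200,000 samples the marginal counts are 40,000 per value of x and 80,000 or 120,000 per value of y, so every product of marginals is above 2**31 - 1. On a platform with 64-bit default integers the check passes either way, which matches the observation that the failure shows up only with 32-bit integers.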