NAs produced by integer overflow #8
Comments
@Melkiades I'm sorry for the late reply. Would you mind sharing a minimal reproducible example, so that I can find out whether it's a bug in the external_validation function?
Yes, this is the simplest reproducible example I could make:
@Melkiades, thanks for making me aware of this issue. "NAs produced by integer overflow" is actually a known issue in R (as mentioned in this stackoverflow thread). In the ClusterR package it occurred when calculating the mutual_information variable. I had to use the gmp::as.bigz() function to allow the multiplication of big integers. I uploaded a new version of the ClusterR package to GitHub. You can install it using devtools::install_github('mlampros/ClusterR'), as it will take some time until I submit the new version to CRAN. Please test it and let me know.
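For readers unfamiliar with the warning: R stores integers as 32-bit signed values, and integer arithmetic whose result exceeds `.Machine$integer.max` (2^31 - 1) returns NA together with the warning quoted in the title. The sketch below imitates that behaviour in Python, whose built-in integers are arbitrary precision and so play a role loosely similar to gmp::as.bigz(). The helper `r_style_int_multiply` and the cluster sizes are hypothetical, chosen only for illustration; this is not the ClusterR code.

```python
# Illustrative sketch only: imitates R's 32-bit integer multiplication.
R_INTEGER_MAX = 2**31 - 1  # same value as .Machine$integer.max in R


def r_style_int_multiply(a, b):
    """Return a * b, or None (standing in for R's NA) when the exact
    product does not fit in R's 32-bit integer range."""
    product = a * b  # Python ints are arbitrary precision, so this is exact
    if abs(product) > R_INTEGER_MAX:
        return None
    return product


# Hypothetical group sizes (assumption, not taken from the issue)
true_class_size = 50_000
predicted_cluster_size = 50_000

# The product of two counts can exceed the 32-bit limit even though
# each count on its own is small.
overflowed = r_style_int_multiply(true_class_size, predicted_cluster_size)
exact = true_class_size * predicted_cluster_size

print(overflowed)  # None, i.e. the analogue of NA
print(exact)       # the exact product, available with big integers
```

Products of group counts like these tend to appear in mutual-information formulas, which is why fairly ordinary input sizes can be enough to hit the limit.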
That is perfect!! Thanks so much :) I also have another test (a silly one) that ends with a resulting
In Python:
@Melkiades, thanks for testing this case. The authors of sklearn added an exception for the case where both the true labels and the clusters perfectly match. In this case the normalized_mutual_info_score should return 1.0. I also added a similar exception to account for this case. I updated the ClusterR package. Please test it and let me know.
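A minimal sketch of why such an exception is useful, written in plain Python rather than taken from sklearn or ClusterR. It assumes the degenerate case is the one where both labelings consist of a single group: both entropies are then zero and the normalized score would otherwise be 0/0. The function names and the arithmetic-mean normalization are assumptions made for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(true_labels, cluster_labels):
    """NMI normalized by the arithmetic mean of the two entropies
    (an assumed choice; other normalizations exist)."""
    n = len(true_labels)
    # Special case: a single class and a single cluster match trivially,
    # but both entropies are 0, so the ratio below would be 0/0.
    if len(set(true_labels)) == 1 and len(set(cluster_labels)) == 1:
        return 1.0
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)
    joint_counts = Counter(zip(true_labels, cluster_labels))
    mutual_info = 0.0
    for (t, c), n_tc in joint_counts.items():
        # n * n_tc and the product of marginal counts are exact here
        # because Python integers do not overflow.
        ratio = (n * n_tc) / (true_counts[t] * cluster_counts[c])
        mutual_info += (n_tc / n) * log(ratio)
    denominator = (entropy(true_labels) + entropy(cluster_labels)) / 2
    return mutual_info / denominator if denominator > 0 else 0.0


print(normalized_mutual_info([1, 1, 1], [2, 2, 2]))        # single group each
print(normalized_mutual_info([1, 1, 2, 2], [2, 2, 1, 1]))  # same partition
print(normalized_mutual_info([1, 1, 2, 2], [1, 2, 1, 2]))  # independent
```

Without the early return, the first call would divide zero by zero; with it, trivially identical partitions score 1.0, in line with the behaviour described above.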
I imagined! It works like a charm. Thanks a lot for the prompt responses :)
Hello everyone! Firstly, thank you for your wonderful work.
Secondly, I wanted to ask how it is possible to get NAs from NMI. I pass a vector of 1s and 2s (40k and 40k) as the true vector and a vector of 1s and 2s (20k and 60k) as the predicted vector.
The result + warning I get are: