New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSD() gives wrong value when some element == 0 #4
Comments
Hi @wkc1986, Thank you so so much for pointing out this issue and I will immediately have a look at it. I will start debugging now and will come back to you soon. Please feel free to send me any other issue you might find. Many thanks and best wishes, |
Hi @wkc1986, I found the bug and fixed it. The problem was the following: In the initial if((P[i] == 0.0) || (Q[i] == 0.0)){
if (P[i] == 0.0)
sum1 += 0.0;
if (Q[i] == 0.0)
sum2 += 0.0;
} else {
PQsum = P[i] + Q[i];
if (unit == "log"){
sum1 += P[i] * log((2.0 * P[i]) / PQsum);
sum2 += Q[i] * log((2.0 * Q[i]) / PQsum);
}
else if (unit == "log2"){
sum1 += P[i] * custom_log2((2.0 * P[i]) / PQsum);
sum2 += Q[i] * custom_log2((2.0 * Q[i]) / PQsum);
}
else if (unit == "log10"){
sum1 += P[i] * custom_log10((2.0 * P[i]) / PQsum);
sum2 += Q[i] * custom_log10((2.0 * Q[i]) / PQsum);
} else {
Rcpp::stop("Please choose from units: log, log2, or log10.");
}
}
} The problem was that I only consider the case I now consider all cases where PQsum = P[i] + Q[i];
if (unit == "log"){
if (P[i] == 0.0 || PQsum == 0.0) {
sum1 += 0.0;
} else {
sum1 += P[i] * log((2.0 * P[i]) / PQsum);
}
if (Q[i] == 0.0 || PQsum == 0.0) {
sum2 += 0.0;
} else {
sum2 += Q[i] * log((2.0 * Q[i]) / PQsum);
}
}
else if (unit == "log2"){
if (P[i] == 0.0 || PQsum == 0.0) {
sum1 += 0.0;
} else {
sum1 += P[i] * custom_log2((2.0 * P[i]) / PQsum);
}
if (Q[i] == 0.0 || PQsum == 0.0) {
sum2 += 0.0;
} else {
sum2 += Q[i] * custom_log2((2.0 * Q[i]) / PQsum);
}
}
else if (unit == "log10"){
if (P[i] == 0.0 || PQsum == 0.0) {
sum1 += 0.0;
} else {
sum1 += P[i] * custom_log10((2.0 * P[i]) / PQsum);
}
if (Q[i] == 0.0 || PQsum == 0.0) {
sum2 += 0.0;
} else {
sum2 += Q[i] * custom_log10((2.0 * Q[i]) / PQsum);
}
} else {
Rcpp::stop("Please choose from units: log, log2, or log10.");
} When running your initial example I now get all the same values: test.pairwise.jsd <- function(x, fn = "log") {
h <- function(p, fn = "log") {
pilogpi <- sapply(p, function(pi, fn) {
if (pi == 0) {
pi
} else {
pi * do.call(fn, list(x = pi))
}
}, fn = fn)
-1 * sum(pilogpi)
}
mean.p <- apply(x, 2, mean)
# browser()
h(mean.p, fn) - mean(apply(x, 1, h, fn = fn))
}
xx <- rbind(c(1,0), c(0.5,0.5))
d1 <- test.pairwise.jsd(xx, "log2")
library(philentropy)
d2 <- JSD(xx)
d3 <- H(apply(xx, 2, mean)) - mean(apply(xx, 1, H))
d4 <- gJSD(t(xx))
d1
Thanks to your excellent example I also unraveled that the
I will now extend the unit tests and write more extensive test for Once more many many thanks for making me aware of this issue, I truly appreciate it :)))) Best wishes, |
Glad to be of help! Your package is the fastest way I know of to calculate JSD matrices and generalized JSD in R, so I'm happy to see it ticking. |
Sorry to be a bother again. I'm using the GitHub version. Comparing against a test JSD function in base R, as well as calculations using
H()
andgJSD()
, onlyJSD()
gives a different result:The text was updated successfully, but these errors were encountered: