Smoothing becomes zero when all the associated documents are empty #6

@koheiw

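When s = colSums(z) is all zero, the mean frequency a = mean(s) is also 0, so the smoothing term a * smooth vanishes. v1 is then all zero and log(v1 / sum(v1)) evaluates to NaN. Using a <- max(mean(s), 1 / length(s)) avoids this by keeping a strictly positive.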

wordmap/R/textmodel.R

Lines 150 to 170 in d832702

    } else {
        weight <- matrix(rep(1, ncol(w) * ncol(y)), ncol = ncol(w), nrow = ncol(y),
                         dimnames = list(colnames(y), colnames(w)))
    }
    m <- colSums(w)
    for (key in sort(featnames(y))) {
        if (verbose) cat(sprintf(' label = "%s"\n', key))
        z <- w[as.logical(y[,key] > 0),]
        s <- colSums(z)
        if (old) {
            v0 <- m - s + smooth
            v1 <- s + smooth
        } else {
            if (smooth >= 1.0)
                warning("The value of smooth became fractional in wordmap v0.92")
            a <- mean(s)
            v0 <- m - s + (a * smooth)
            v1 <- s + (a * smooth)
        }
        model[key,] <- log(v1 / sum(v1)) - log(v0 / sum(v0)) # log-likelihood ratio
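
A minimal sketch of the proposed guard, using toy count vectors instead of a real dfm. `llr_row` is a hypothetical helper written for illustration, not a function in wordmap, and `smooth = 0.01` and the values of `m` and `s` are assumed. It contrasts the current `a <- mean(s)` with the floored version when every count in `s` is zero:

```r
# Hypothetical helper (not part of wordmap): one row of the model,
# computed with the floored mean suggested in this issue.
llr_row <- function(s, m, smooth = 0.01) {
    a <- max(mean(s), 1 / length(s))  # floor keeps a > 0 when s is all zero
    v0 <- m - s + (a * smooth)
    v1 <- s + (a * smooth)
    log(v1 / sum(v1)) - log(v0 / sum(v0))  # log-likelihood ratio
}

m <- c(x = 4, y = 2, z = 0)  # toy corpus-wide feature counts (assumed)
s <- c(x = 0, y = 0, z = 0)  # all documents matched by the label are empty

# Current behaviour: a = mean(s) = 0, so the smoothing term disappears
a_old <- mean(s)
v1_old <- s + (a_old * 0.01)
row_old <- log(v1_old / sum(v1_old))  # 0 / 0 inside log()

# Proposed behaviour: a is floored at 1 / length(s)
row_new <- llr_row(s, m)

print(row_old)
print(row_new)
```

With the floor in place, every element of `v0` and `v1` is strictly positive, so both logarithms are defined even for features that never occur.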
