Error in using aggregation functions (any,all etc) with nomatch = 0L #813

Closed
nigmastar opened this issue Sep 16, 2014 · 1 comment

@nigmastar

Please consider:

dt <- data.table(id = c("a", "a", "b", "b"),
                 var = c(1.1, 2.5, 6.3, 4.5),
                 key = "id")

#1.9.3 latest revision
dt["c", list(id, check = any(var > 3)), nomatch=0L]
# Error in if (mn%%n[i] != 0) warning("Item ", i, " is of size ", n[i],  : 
#   missing value where TRUE/FALSE needed

#1.9.3 revision 1200 (last 'stable')
dt["c", list(check = any(var > 3)), nomatch=0L]
# Empty data.table (0 rows) of 2 cols: id,check

This happens only when a brand-new column is generated with an aggregation function, not when existing columns are simply selected or when no aggregation function is used. These

> dt["c", list(id, check = var), nomatch=0L]
Empty data.table (0 rows) of 2 cols: id,check
> dt["c", list(id, check = var > 3), nomatch=0L]
Empty data.table (0 rows) of 2 cols: id,check

evaluate as expected. The error is caused by these lines of the as.data.table.list function in data.table.R:

    n = vapply(x, length, 0L)
    mn = max(n)
    x = copy(x)
    if (any(n<mn)) 
    for (i in which(n<mn)) {
        if (!is.null(x[[i]])) {# avoids warning when a list element is NULL
            # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
            if (mn %% n[i] != 0) 
                warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
            x[[i]] = rep(x[[i]], length.out=mn)
        }
    }

I believe the following happens:

x <- as.list(dt["c", list(id, var), nomatch=0L])
x$var <- any(x$var > 3) # simulates the internal process: a length-1 aggregate next to a zero-length id
n = vapply(x, length, 0L)
mn = max(n)
if (any(n<mn)) 
  for (i in which(n<mn)) {
    if (!is.null(x[[i]])) {# avoids warning when a list element is NULL
      # Implementing FR #4813 - recycle with warning when nr %% nrows[i] != 0L
      if (mn %% n[i] != 0) # <----------------- HERE
        warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn, " (recycled leaving a remainder of ", mn%%n[i], " items)")
      x[[i]] = rep(x[[i]], length.out=mn)
    }
  }
# Error in if (mn%%n[i] != 0) warning("Item ", i, " is of size ", n[i],  : 
#   missing value where TRUE/FALSE needed
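
The failing condition can be isolated in base R, without data.table. This is a minimal sketch; the variable names (`remainder`, `cond`, `msg`) are chosen here for illustration and do not appear in the package source.

```r
# Column lengths as in the failing call: zero-length id, length-1 aggregate
n  <- c(id = 0L, check = 1L)
mn <- max(n)                    # 1L

# Integer modulo by zero yields NA in R
remainder <- mn %% n[["id"]]    # 1L %% 0L
is.na(remainder)

# The comparison used in the recycling loop therefore evaluates to NA
cond <- remainder != 0
is.na(cond)

# `if` cannot branch on NA; capture the resulting error message
msg <- tryCatch(
  if (cond) "would warn" else "no warning",
  error = function(e) conditionMessage(e)
)
msg
```

Both `is.na()` calls return TRUE, and `msg` holds the "missing value where TRUE/FALSE needed" message from the traceback above.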

1L %% 0L is NA, so the if condition is NA, hence the error. In fact, if the result contains no zero-length columns, the call works:

> dt["c", list(check = any(var > 3)), nomatch=0L]
   check
1: FALSE

That said, I believe the result of the last expression should be Empty data.table (0 rows) of 1 col: check. If any column has zero length, couldn't as.data.table.list simply set the number of rows to zero? I can't think of a case where a column has zero length while i evaluates to a positive (non-zero) number of rows.
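
To make the suggestion concrete, here is a rough sketch of the recycling step with a zero-length guard. `recycle_columns` is a hypothetical stand-alone helper written for this issue, not data.table's code and not necessarily how a fix should look. It assumes that any zero-length, non-NULL element means the whole result has zero rows.

```r
# Hypothetical helper (not part of data.table): recycle list elements to a
# common length, collapsing to zero rows if any non-NULL element is empty.
recycle_columns <- function(x) {
  if (length(x) == 0L) return(x)
  n <- vapply(x, length, 0L)
  is_null <- vapply(x, is.null, logical(1))

  # Assumption: one zero-length column implies a zero-row result
  if (any(n[!is_null] == 0L)) {
    x[!is_null] <- lapply(x[!is_null], function(col) col[0L])
    return(x)
  }

  mn <- max(n)
  for (i in which(n < mn & !is_null)) {
    # Safe here: n[i] is never 0, so the modulo cannot be NA
    if (mn %% n[i] != 0L)
      warning("Item ", i, " is of size ", n[i], " but maximum size is ", mn,
              " (recycled leaving a remainder of ", mn %% n[i], " items)")
    x[[i]] <- rep(x[[i]], length.out = mn)
  }
  x
}

# Zero-length id next to a length-1 aggregate: both end up with length 0
res <- recycle_columns(list(id = character(0), check = FALSE))
vapply(res, length, 0L)

# Ordinary recycling is unchanged
res2 <- recycle_columns(list(a = 1:4, b = 1:2))
res2$b
```

With this guard the modulo is only reached when every non-NULL length is positive, so the NA condition cannot arise.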

Thanks,
Michele.

@arunsrinivasan
Member

I recall another issue happening due to the same reason. Can't recall which one now. Will fix. Thanks.
