Function defined in package does not work with data table #548

Closed
renkun-ken opened this issue Aug 17, 2014 · 6 comments

Comments

@renkun-ken

The problem occurs with both the CRAN version and the latest dev version of dplyr.

While creating some closures to simplify my work, I ran into the following errors.

Fails with closure produced in a package function

A minimal reproducible example:

library(dplyr)
library(data.table)

When the following function is evaluated in the global environment, it works fine.

test <- function(data, fun) {
  function(...) {
    fun(data,...)
  }
}
> f <- test(as.data.table(mtcars), filter)
> head(f(mpg <= mean(mpg)),1)
    mpg cyl  disp  hp drat   wt  qsec vs am gear carb
1: 18.7   8 360.0 175 3.15 3.44 17.02  0  0    3    2
> f <- test(as.data.table(mtcars), mutate)
> head(f(m = mpg * 2),1)
   mpg cyl disp  hp drat   wt  qsec vs am gear carb  m
1:  21   6  160 110  3.9 2.62 16.46  0  1    4    4 42

However, if test() is defined and exported in a package, it does not work any more when data is a data.table and fun is a dplyr verb function. For example,

# in some package
#' @export
test <- function(data, fun) {
  function(...) {
    fun(data,...)
  }
}
> f <- test(as.data.table(mtcars), filter)
> head(f(mpg <= mean(mpg)))
Error in `[.data.frame`(x, i, j) : object 'mpg' not found
> f <- test(as.data.table(mtcars), mutate)
> f(m = mpg * 2)
Error in `:=`(m, mpg * 2) : 
  Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

Fails with ordinary package function

After some testing, I found that the original problem is a bit indirect. The same problem occurs with an ordinary function, as long as it is defined in a package.

If I evaluate the following function in the global environment, it works fine.

test1 <- function(data, fun, ...) {
  fun(data,...)
}
> head(test1(as.data.table(mtcars), mutate, m = mpg * 2),1)
   mpg cyl disp  hp drat   wt  qsec vs am gear carb  m
1:  21   6  160 110  3.9 2.62 16.46  0  1    4    4 42

If test1() is a package function like

#' @export
test1 <- function(data, fun, ...) {
  fun(data,...)
}

then it no longer works (the same problem occurs as shown in the previous section):

> head(test1(as.data.table(mtcars), mutate, m = mpg * 2),1)
Error in `:=`(m, mpg * 2) : 
  Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

My session info

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.2 dplyr_0.2.0.99   pipeR_0.4-1     

loaded via a namespace (and not attached):
[1] assertthat_0.1 magrittr_1.1.0 parallel_3.1.1 plyr_1.8.1     Rcpp_0.11.2   
[6] reshape2_1.4   stringr_0.6.2  tools_3.1.1   

See also: renkun-ken/pipeR#28.

@mattdowle

However, if test() is defined and exported in a package, it does not work any more when data is a data.table and fun is a dplyr verb function.

Does your new package Depend or Import data.table? See :
http://stackoverflow.com/a/10529888/403310

@renkun-ken

Thanks @mattdowle! When I add data.table to Imports, it works, but it looks a bit odd: the package itself does not need data.table (test() is defined for general purposes, and data.table does not appear anywhere in the package). It is the user of the package who, for some reason, passes a data.table to test() and processes it with dplyr functions. The dependency looks quite unnatural, but it works anyway.

I'll take a closer look at why this needs to be done :)

@renkun-ken

Does this imply that any package that should work with data.table only through user input (even though the package itself does not use it at all) must depend on data.table? Am I correct?

In my case, the test package defines a general-purpose function test() that has nothing to do with data.table, but the user may pass a data.table as its argument. To work with it, must the package declare that it depends on the data.table package even if it does not use anything from it explicitly?

@renkun-ken

@arunsrinivasan wrote:

Is it true that all packages that want to be compatible with data.table (even though they don't use it anywhere inside but the users might pass in some data.table) must declare dependency on it?

No, that's not the case.

I think the issue is that dplyr neither imports nor depends on data.table, rather suggests it. And, my guess is that it doesn't have .datatable.aware=TRUE set as well.

But dplyr is not necessarily using functions in data.table, and it's also weird to import or depend on it rather than suggest it. Sounds like everything must be somehow coupled?

@renkun-ken

Adding .datatable.aware = TRUE to the package namespace seems to solve the problem.
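
For anyone landing here with the same problem, a minimal sketch of that fix. It assumes the flag is set in one of the R source files of the package that defines test(); the file name R/zzz.R is hypothetical, since the flag only needs to end up as a top-level object in the package namespace.

```r
# R/zzz.R (hypothetical file name) in the package that defines test()

# Top-level assignment, so the object ends up in the package namespace.
# It marks the package's code as data.table-aware without declaring
# data.table in Depends or Imports.
.datatable.aware <- TRUE
```

After reinstalling the package, the failing calls above should behave as they do in the global environment.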

@mattdowle

Great. Could you please close this issue, and any related ones?

yiufung referenced this issue in r-lib/pkgload Jan 6, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018