Failure when using a formula in the global environment #1012

Closed
bart1 opened this issue Sep 13, 2019 · 2 comments

bart1 commented Sep 13, 2019

I just encountered the following case. The calculation of md fails when the model formula (form) is defined in the global environment, but it works fine when the formula is written directly in the command. The first example works as expected; the second fails with a rather strange message.

I'm not sure whether this is an inherent limitation of working out which objects live in which environment, so I did not file it as a bug, but I thought I should make you aware of it anyway.

require(drake)
#> Loading required package: drake
wgd <- c(.3, .25)
form <- a ~ b
p1 <- drake_plan(
  data = target(data.frame(a = 1:n, b = runif(n)), transform = map(n = !!c(50, 200))),
  md = target(
    {
      glm(a ~ b, data = data, weights = data$a^wgt)
    },
    transform = cross(wgt = !!wgd, data)
  )
)
make(p1)
#> target data_50
#> target data_200
#> target md_0.3_data_50
#> target md_0.25_data_50
#> target md_0.3_data_200
#> target md_0.25_data_200
p2 <- drake_plan(
  data = target(data.frame(a = 1:n, b = runif(n)), transform = map(n = !!c(50, 200))),
  md = target(
    {
      glm(form, data = data, weights = data$a^wgt)
    },
    transform = cross(wgt = !!wgd, data)
  )
)
make(p2)
#> target md_0.3_data_50
#> fail md_0.3_data_50
#> Error: Target `md_0.3_data_50` failed. Call `diagnose(md_0.3_data_50)` for details. Error message:
#>   object 'data_50' not found

Created on 2019-09-13 by the reprex package (v0.3.0)

Collaborator

wlandau commented Sep 13, 2019

Unfortunately, this quirk is unavoidable. I am not sure what exactly causes it, but I do observe that it happens when the formula sits in a parent environment of the data. I can reproduce what you see without drake.

form <- a ~ b
eval(
  quote({
    my_data <- data.frame(a = 1:10, b = runif(10))
    glm(form, data = my_data, weights = my_data$a ^ .25)
  }),
  envir = new.env(parent = globalenv())
)
#> Error in eval(extras, data, env): object 'my_data' not found

my_data <- data.frame(a = 1:10, b = runif(10))
glm(form, data = my_data, weights = my_data$a ^ .25)
#> 
#> Call:  glm(formula = form, data = my_data, weights = my_data$a^0.25)
#> 
#> Coefficients:
#> (Intercept)            b  
#>       8.068       -3.490  
#> 
#> Degrees of Freedom: 9 Total (i.e. Null);  8 Residual
#> Null Deviance:       115.4 
#> Residual Deviance: 97.83     AIC: 53.41

Created on 2019-09-13 by the reprex package (v0.3.0)

drake deliberately builds targets in a child environment of your (usually global) environment. This precaution prevents drake from messing with your R session, and it is not going to change.
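
To make the "child environment" idea concrete, here is a minimal base R sketch. It is only an illustration of the general mechanism and is not drake's internal code; the names `build_env`, `helper_value`, and `scratch` are made up for the example.

```r
# Illustration only: NOT drake's implementation.
# `parent_env` stands in for the user's (usually global) environment.
parent_env <- environment()
assign("helper_value", 10, envir = parent_env)

# A child environment: lookups fall through to the parent,
# but new assignments stay in the child.
build_env <- new.env(parent = parent_env)

result <- eval(
  quote({
    scratch <- helper_value * 2  # `helper_value` is found via the parent
    scratch
  }),
  envir = build_env
)

# `scratch` was created in the child only, so the parent is untouched.
in_parent <- exists("scratch", envir = parent_env, inherits = FALSE)
in_child <- exists("scratch", envir = build_env, inherits = FALSE)
```

Code evaluated this way can read objects from the parent environment, but anything it creates lives only in the child.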

But the glm + formulas issue is surprising. Maybe someone else can explain why the formula for a glm() needs to be defined in the same environment as the data? I do observe that a formula has its own environment. Maybe that has something to do with it.

form <- a ~ b
str(form)
#> Class 'formula'  language a ~ b
#>   ..- attr(*, ".Environment")=<environment: R_GlobalEnv>

Created on 2019-09-13 by the reprex package (v0.3.0)
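
A likely explanation, based on the documentation of `model.frame()`: variables in the formula and in extra arguments such as `weights` are looked up first in `data` and then in the environment of the formula, not in the environment where `glm()` was called. The sketch below uses `local()` to mimic a child environment and tries two possible fixes in plain base R; the names `err`, `fit_env`, and `fit_col` are made up for the example, and none of this has been checked against drake itself.

```r
form <- a ~ b  # the formula's environment is wherever this line runs

# Failing case: `my_data` exists only in the child environment, but the
# `weights` expression is evaluated in `data`, then in environment(form).
err <- tryCatch(
  local({
    my_data <- data.frame(a = 1:10, b = runif(10))
    glm(form, data = my_data, weights = my_data$a^0.25)
  }),
  error = function(e) conditionMessage(e)
)

# Possible fix 1: re-point a copy of the formula at the child environment.
fit_env <- local({
  my_data <- data.frame(a = 1:10, b = runif(10))
  f <- form
  environment(f) <- environment()
  glm(f, data = my_data, weights = my_data$a^0.25)
})

# Possible fix 2: refer to the column by name so it is found in `data`.
fit_col <- local({
  my_data <- data.frame(a = 1:10, b = runif(10))
  glm(form, data = my_data, weights = a^0.25)
})
```

The second variant avoids touching the formula's environment at all, because `a` is resolved as a column of `data` before any environment lookup happens.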

@wlandau-lilly
Collaborator

A workaround is possible if you force the data and formula to be in the same environment:

library(drake)
suppressPackageStartupMessages(library(lme4))

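fit_lmer <- function(dat) {
  # The function's own environment, where `dat` is bound as an argument.
  envir <- environment()
  # Make the binding explicit so the data and formula share one environment.
  envir$dat <- dat
  # Build the formula here so that its environment is `envir`, not the global one.
  f <- as.formula("Reaction ~ Days + (Days | Subject)", env = envir)
  lme4::lmer(f, data = dat)
}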

plan <- drake_plan(
  dat = sleepstudy,
  mod = fit_lmer(dat)
)

make(plan)
#> ▶ target dat
#> ▶ target mod

Created on 2020-07-29 by the reprex package (v0.3.0)
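
The same pattern can be sketched for the `glm()` call from the original report, in base R only. The helper name `fit_weighted_glm` is hypothetical, and this has not been run inside a drake plan.

```r
# Hypothetical helper: build the formula inside the function so that the
# formula, `dat`, and `wgt` all live in the same environment.
fit_weighted_glm <- function(dat, wgt) {
  f <- as.formula("a ~ b", env = environment())
  # `dat$a^wgt` is not a column of `dat`, so it is resolved in
  # environment(f), which is this function's frame.
  glm(f, data = dat, weights = dat$a^wgt)
}

dat <- data.frame(a = 1:50, b = runif(50))
fit <- fit_weighted_glm(dat, wgt = 0.3)
```

Because the lookup goes through the function's frame, this should not depend on which environment the helper is called from.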
