Error when passing optional arguments to cross_validate #16

Closed
osofr opened this issue Jun 28, 2017 · 10 comments

@osofr

osofr commented Jun 28, 2017

A minimal example is below, taken from the vignette. The function cross_validate currently errors out whenever I try to pass any optional arguments to cv_fun. The error appears to originate downstream, in the future package. This is crucial for me, since I have to pass optional arguments to cv_fun. Any advice on workarounds?

library(origami)
# fold-specific function: fit a linear model on the training split and
# compute squared errors on the validation split
cvlm <- function(fold, mydata) {
    train_data <- training(mydata)
    valid_data <- validation(mydata)

    mod <- lm(mpg ~ ., data = train_data)
    preds <- predict(mod, newdata = valid_data)
    list(coef = data.frame(t(coef(mod))), SE = ((preds - valid_data$mpg)^2))
}

# cross-validated estimate; the data is passed on through cross_validate's ... to cvlm
folds <- make_folds(mtcars)
results <- cross_validate(cvlm, folds, mtcars)
results <- cross_validate(cvlm, folds, mydata = mtcars)

In general, I think it might not be best practice to provide an example in which the function cvlm depends on an object (mtcars) defined in the calling environment of the function. This creates quite a bit of confusion and makes the code hard to read. That's just my opinion, though.
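To be concrete, the pattern I'm referring to looks roughly like the following (an illustrative sketch, not the vignette's exact code; the name cvlm_implicit is mine):

cvlm_implicit <- function(fold) {
    # reads mtcars from the enclosing environment instead of taking the data
    # as an argument, which is the pattern criticized above
    train_data <- training(mtcars)
    valid_data <- validation(mtcars)

    mod <- lm(mpg ~ ., data = train_data)
    preds <- predict(mod, newdata = valid_data)
    list(coef = data.frame(t(coef(mod))), SE = ((preds - valid_data$mpg)^2))
}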

@ck37
Contributor

ck37 commented Jun 28, 2017

Yep, I ran into that exact same issue myself and reported it here (see workarounds): futureverse/future#156

@nhejazi
Member

nhejazi commented Jun 28, 2017

Thanks for reporting this error, Oleg. And thanks, Chris, for having already reported this error in future to Henrik.

I'm honestly not sure of a quick fix off the top of my head, but I can do a bit of digging and report anything I find back here. Based on the conversation in the issue reported to future, it looks like Henrik plans to fix this soon.

Also, thanks for pointing out that slip-up in the definition of the cvlm function. I agree that having the function defined that way is bad practice and will update that example (and any others like it) accordingly.

@nhejazi nhejazi added the bug label Jun 28, 2017
@nhejazi nhejazi self-assigned this Jun 28, 2017
@jeremyrcoyle
Collaborator

future.globals=FALSE should work fine for plan(sequential) and plan(multicore), but for plan(multisession) you'll need to specify the globals and packages needed by the different sessions manually. You can see an example of that here: https://github.com/jeremyrcoyle/SuperLearner/blob/integrate_origami/tests/testthat/test-origami_SuperLearner.R#L39
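Roughly, the manual approach uses the future package's own globals/packages arguments, something like this (a sketch for a single fold, not origami's internal code; the globals and packages listed are just what this particular example needs):

library(future)
plan(multisession)

# evaluate one fold in a worker session, listing the objects and packages that
# session needs by hand rather than relying on automatic globals detection
f <- future(
    cvlm(fold = folds[[1]], mydata = mtcars),
    globals = list(cvlm = cvlm, folds = folds, mtcars = mtcars),
    packages = "origami"
)
value(f)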

This seems like something of a regression from when we were using foreach. Depending on how long the underlying issue takes to get fixed, we might want to consider going back to foreach or having an option to use either foreach or future. Thoughts?

@nhejazi
Member

nhejazi commented Jun 29, 2017

Ok -- good to know that you encountered this problem in the origami-SL integration as well.

Given that it's unclear when the error might be resolved on the future side, the most sensible thing seems to be to restore the foreach implementation, either with an option for users to choose between foreach and future (as you suggested) or with foreach serving as a failsafe for any problems that arise from future (the latter assumes that the foreach implementation was fairly robust and suggests something in the style of try-catch).
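For the failsafe variant, the control flow I have in mind is roughly the following (just a sketch; the foreach branch would still need to combine results the way cross_validate does):

library(foreach)

cv_results <- tryCatch(
    # first attempt: the current future-based implementation
    cross_validate(cvlm, folds, mydata = mtcars),
    error = function(e) {
        # failsafe: fall back to a plain (sequential) foreach loop over the folds
        foreach(fold = folds) %do% cvlm(fold, mydata = mtcars)
    }
)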

@ck37
Contributor

ck37 commented Jun 29, 2017

This wasn't encountered in the original SL implementation; it was only added today: https://github.com/jeremyrcoyle/SuperLearner/commit/b0b3201befb7c262a8a4220ce45e403903056c14

It would have been identified, though, if origami had examples (and/or tests) covering the parallelization options.

@jeremyrcoyle
Collaborator

@ck37 I've been aware of the issue for a while now (see https://app.asana.com/0/270379281729322/369465305300769/f), but I haven't had time to look into it in more detail (I'm currently on vacation). There's a commented-out test here: https://github.com/jeremyrcoyle/origami/blob/master/tests/testthat/test_future_plan.R#L42 that is disabled because it was failing in some environments, and it didn't seem to make sense to delay our package's release over an upstream issue that we don't yet fully understand and would have to wait for Henrik to fix.
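The kind of check involved looks roughly like this (an illustrative sketch only, not the actual contents of test_future_plan.R):

library(testthat)
library(future)
library(origami)

test_that("cross_validate passes extra arguments through under a future plan", {
    plan(multisession)
    # a tiny cv_fun that takes the data as an optional argument, as in the original post
    cv_mean <- function(fold, mydata) {
        list(mean_mpg = mean(training(mydata)$mpg))
    }
    folds <- make_folds(mtcars)
    expect_error(cross_validate(cv_mean, folds, mydata = mtcars), NA)
})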

@ck37
Contributor

ck37 commented Jun 30, 2017

Henrik has fixed the bug: futureverse/future#156

@nhejazi
Member

nhejazi commented Jul 1, 2017

Looks like we can close this issue once we confirm that the change in globals resolves our problem (after installing globals via remotes::install_github("HenrikBengtsson/globals@develop"), per Henrik's recommendation).

It sounds like he's planning a new CRAN release, so we should be ok without having to mess with how dependencies are currently specified for origami.
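For anyone who wants to check locally in the meantime, the confirmation is just the following (assuming the development install goes through cleanly, and reusing cvlm from the original post):

remotes::install_github("HenrikBengtsson/globals@develop")

library(origami)
folds <- make_folds(mtcars)
# this is the call that errored before the globals fix
results <- cross_validate(cvlm, folds, mydata = mtcars)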

@nhejazi nhejazi closed this as completed Jul 15, 2017
@nhejazi nhejazi reopened this Jul 15, 2017
@nhejazi
Member

nhejazi commented Jul 15, 2017

Closing since globals 0.10.1 has been added to CRAN and made a dependency for future.

@nhejazi nhejazi closed this as completed Jul 15, 2017
@nhejazi
Member

nhejazi commented Jul 16, 2017

The issue with the example cv_funs relying on objects defined in the calling environment, as noted in the original post, has been resolved in all examples as of 64569a8.
