Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using R_tryEval() instead of Rcpp_eval() #2295

Closed
krlmlr opened this issue Dec 7, 2016 · 22 comments
Closed

Consider using R_tryEval() instead of Rcpp_eval() #2295

krlmlr opened this issue Dec 7, 2016 · 22 comments
Assignees
Labels
feature a feature request or enhancement performance 🚀

Comments

@krlmlr
Copy link
Member

krlmlr commented Dec 7, 2016

If I understand correctly, this would offer a fail-safe alternative, but errors will most likely be printed to the console instead of captured in the exception object.

@kevinushey: Rcpp moved away from R_TopLevelExec() a year ago (RcppCore/Rcpp#311). What would be other drawbacks of using R_tryEval() instead of Rcpp_eval() if performance is important (RcppCore/Rcpp#578)? Thanks!

@kevinushey
Copy link
Contributor

The main issue is that R_tryEval() uses R_TopLevelExec() under the hood, which implies that any active R condition handlers won't actually be in effect while the associated code is executed. This means that code like:

tryCatch(
  topLevelExec(),
  error = function(e) print("Caught error")
)

won't actually work, as that code ends up evaluating in an entirely separate context (without active handlers visible). The original change in Rcpp ended up causing loads of problems in e.g. Shiny since many Rcpp functions were evaluated within condition handlers in this way.

If you plan on evaluating arbitrary user code, then R_TopLevelExec() is likely the wrong way to go.

@krlmlr
Copy link
Member Author

krlmlr commented Dec 7, 2016

Thanks. But I can detect errors raised by R_tryEval(), and call Rcpp::stop() in this case. Warnings are and other conditions are not caught. (See example at the end.)

The use case is the evaluation of arbitrary R code that doesn't need to interact with condition handlers defined before entering C++, such as aggregation functions in a summarise() or mutate() call:

aggregator <- function(x) { ... }
iris %>% group_by(Species) %>% summarize(agg = aggregator(Petal.Width))

They are called for each group, therefore performance is important. Also, they can (in theory) throw an error; I just want to exit safely in this case, and don't care much about the condition object (because the error is printed anyway).

Would that be a valid use case for R_tryEval() ?

R_tryEval() example

try_eval_ <-
  Rcpp::cppFunction(
    '
SEXP try_eval(SEXP call, SEXP env) {
  int error = 0;
  SEXP ret = R_tryEval(call, env, &error);
  if (error) Rcpp::stop("R_tryEval() raised error");
  return ret;
}
    ')

try_eval <- function(call, env = parent.frame()) try_eval_(substitute(call), env)

a <- 5
try_eval(1 + a)
## works fine

try_eval(stop("oops"))
## Rcpp::stop() called

tryCatch(try_eval(log(-1)), warning = function(e) print("warning caught"))
## warning handler not run

@kevinushey
Copy link
Contributor

kevinushey commented Dec 7, 2016

If you're okay with condition handlers not being caught through this mechanism (ie you don't plan on running user code which needs to catch messages or warnings) then it should be okay. Ie, you explicitly want to disallow something like:

withCallingHandlers(
    iris %>% group_by(Species) %>% summarize(agg = aggregator(Petal.Width)),
    message = function(m) print(m)
)

where aggregator is some function that might emit an R message that someone wants to catch.

The other thing you might need to double check: how do interrupt handlers work here? E.g. one can do something like:

tryCatch(
    Sys.sleep(100),
    interrupt = function(e) print("Caught interrupt")
)

# press C-c to interrupt

Just in case this is something that may be relevant.

@krlmlr
Copy link
Member Author

krlmlr commented Dec 7, 2016

Awesome, thanks. I tested:

try_eval(Sys.sleep(100))

An interrupt just got me to my Rcpp::stop() call, which is not ideal, but does the job.

@krlmlr
Copy link
Member Author

krlmlr commented Dec 7, 2016

@hadley: I'm thinking about using R_tryEval() instead of Rcpp::Rcpp_eval() for evaluating expressions in dplyr verbs. I have reason to expect substantial performance benefits, especially for grouped mutate/summary with regular evaluation for many groups. However, there are some drawbacks:

  • Errors can be propagated, but the error message is only printed to the console and cannot be propagated
  • Warnings are not propagated, but they are preserved
  • Messages cannot be captured or suppressed
  • Interrupts cannot be distinguished from errors (except by missing output on the console)

Thoughts?

try_eval_ <-
  Rcpp::cppFunction(
    '
SEXP try_eval(SEXP call, SEXP env) {
  int error = 0;
  SEXP ret = R_tryEval(call, env, &error);
  if (error) Rcpp::stop("R_tryEval() raised error");
  return ret;
}
    ')

try_eval <- function(call, env = parent.frame()) try_eval_(substitute(call), env)

tryCatch(
  try_eval({
    message("message")
    warning("warning")
    stop("stop")
  }),
  error = function(e) cat("error handled: ", conditionMessage(e), "\n", sep = ""),
  warning = function(e) print("warning handled"),
  message = function(e) print("message handled")
)
## message
## Error: stop
## In addition: Warning message:
## warning 
## error handled: R_tryEval() raised error

@hadley
Copy link
Member

hadley commented Dec 7, 2016

I think that seems like a reasonable trade-off. Might also be worth checking to see what data.table does?

@krlmlr
Copy link
Member Author

krlmlr commented Dec 7, 2016

data.table seems to do plain eval(), this is safe in C because they don't have destructors that need to be called when unwinding the stack after a longjmp...

@krlmlr krlmlr added data frame feature a feature request or enhancement labels Dec 7, 2016
@lionel-
Copy link
Member

lionel- commented Apr 8, 2017

Here is a package where they wrap a dplyr pipeline in a try block: https://github.com/business-science/tidyquant/blob/9fb6b93f07b4e1f940fec3c85e0bf46d5980f448/R/tq_stock_list.R#L178

@randy3k
Copy link

randy3k commented Nov 5, 2017

There is also R_tryCatchError now to catch possible error in user code. I guess that it could replace the usage of R_TopLevelExec.

PS: Oh, it has been discussed in RcppCore/Rcpp#578 and it was mentioned that the speed of R_tryCatchError might be low.

@krlmlr
Copy link
Member Author

krlmlr commented Jan 17, 2018

There's new hope with RcppCore/Rcpp#789 and its successors, we still can fallback to R_tryEval() for R < 3.5.0.

@krlmlr
Copy link
Member Author

krlmlr commented Jul 3, 2018

R 3.5 also has R_tryCatch(), but according to the documentation it also uses R-level tryCatch(), so not much to gain here.

https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Condition-handling-and-cleanup-code

@lionel-
Copy link
Member

lionel- commented Jul 3, 2018

I don't think we should fall back to R_tryEval() for R < 3.5.0 by the way. We need consistent semantics and it is ok if dplyr is a little less performant with old R versions.

@romainfrancois
Copy link
Member

This is now irrelevant with @lionel- work on Rcpp 0.12.18

@krlmlr
Copy link
Member Author

krlmlr commented Sep 15, 2018

Do we need to define RCPP_USE_UNWIND_PROTECT to enable it?

@krlmlr krlmlr reopened this Sep 15, 2018
@lionel-
Copy link
Member

lionel- commented Sep 15, 2018

Yes but perhaps better test-run it for a while? This is all very new, both in R and in Rcpp. We could set it on Travis to start with.

@krlmlr
Copy link
Member Author

krlmlr commented Sep 15, 2018

I like this idea. How do we proceed? It might be easiest if Travis would set an environment variable DPLYR_COMPILER_FLAGS which is then picked up by the build. Do we want to sneak in a ${DPLYR_COMPILER_FLAGS} into our Makevars ?

@lionel-
Copy link
Member

lionel- commented Sep 15, 2018

Oh hmm I thought we would need configure to make it work (cf suggestion on Slack) but that's much neater this way, good idea.

@lionel-
Copy link
Member

lionel- commented Sep 15, 2018

We should do the same on Appveyor, this jumping/throwing business is platform-sensitive.

@krlmlr
Copy link
Member Author

krlmlr commented Sep 15, 2018

I'll do it.

@krlmlr
Copy link
Member Author

krlmlr commented Sep 15, 2018

Travis build: https://travis-ci.org/tidyverse/dplyr/builds/428982506

AppVeyor build: https://ci.appveyor.com/project/tidyverse/dplyr/build/1.0.1722

Not opening a pull request to avoid checking the branch and the PR.

@krlmlr
Copy link
Member Author

krlmlr commented Sep 15, 2018

Interesting: https://travis-ci.org/tidyverse/dplyr/jobs/428993952#L1307

Error: segfault from C stack overflow

Let's watch out for these.

@lock
Copy link

lock bot commented Apr 3, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Apr 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature a feature request or enhancement performance 🚀
Projects
None yet
Development

No branches or pull requests

6 participants