Fix protect-unwind #859

Merged
merged 13 commits into from
Jun 7, 2018

lionel-
Contributor

@lionel- lionel- commented Jun 1, 2018

  • Unshield the protect-unwind changes from #789 (Use protect-unwind API and add Rcpp_fast_eval()).

  • Don't throw exceptions across the C stack. I believe this is what caused the issues on Windows. Instead we now skip the C frames using a longjump and only then throw the longjump exception across the C++ stack.

  • Add unit tests for generating attributes in packages. One mock package exports a function via Rcpp::interfaces(cpp) and the other calls it. The tests check that the C++ stack is properly unwound on longjumps issued from the exported function.
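
The second bullet can be sketched without R at all. The block below is a minimal illustration, not the code in this PR: `mock_unwind_protect()` is a made-up stand-in for a C-level unwind-protect entry point, `LongjumpException` here is a local struct rather than `Rcpp::LongjumpException`, and a negative status is an assumed way of modelling "R wants to jump". The point is the ordering: the cleanup callback leaves the C frames with a plain `longjmp`, and the exception is thrown only once control is back in a C++ frame.

```cpp
#include <cassert>
#include <csetjmp>
#include <stdexcept>

// Hypothetical stand-in for Rcpp::LongjumpException.
struct LongjumpException : std::runtime_error {
    LongjumpException() : std::runtime_error("longjump") {}
};

// Mock of a C-level unwind-protect entry point (NOT the real R API).
// A negative status from `body` models the start of a non-local exit; the
// cleanup callback is then told that a jump is in progress.
int mock_unwind_protect(int (*body)(void*), void* body_data,
                        void (*cleanup)(void*, int), void* cleanup_data) {
    int status = body(body_data);
    cleanup(cleanup_data, status < 0);
    return status;
}

// Cleanup callback: leave the "C" frames with a plain longjmp, never a throw.
void jump_to_cxx_frame(void* data, int jump) {
    if (jump) std::longjmp(*static_cast<std::jmp_buf*>(data), 1);
}

// C++ entry point: the exception is thrown only after the longjmp has
// returned control to this C++ frame.
int protected_call(int (*body)(void*), void* body_data) {
    std::jmp_buf jmpbuf;
    if (setjmp(jmpbuf)) {
        throw LongjumpException();
    }
    return mock_unwind_protect(body, body_data, jump_to_cxx_frame, &jmpbuf);
}

// Records whether its destructor ran, to observe C++ stack unwinding.
struct Tracker {
    bool* destroyed;
    explicit Tracker(bool* flag) : destroyed(flag) {}
    ~Tracker() { *destroyed = true; }
};

int jumping_body(void*) { return -1; }
int normal_body(void* data) { return *static_cast<int*>(data); }

// True when the jump surfaced as an exception and the Tracker destructor
// ran before the handler was entered.
bool cxx_stack_was_unwound() {
    bool destroyed = false;
    try {
        Tracker tracker(&destroyed);
        protected_call(jumping_body, nullptr);
    } catch (const LongjumpException&) {
        return destroyed;
    }
    return false;
}
```

Calling `cxx_stack_was_unwound()` exercises the jumping path, and `protected_call(normal_body, &value)` the ordinary one. The frames skipped by `longjmp` in this sketch hold no objects with destructors, which is what makes the jump well-defined in C++.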

Win-builder and r-hub don't seem to be reliable for testing these changes. Even with current master, the former always gets stuck and the latter randomly hangs when running the tests or rebuilding vignettes. However, I have set up a local Windows 10 setup, and Rcpp passes the tests there. With #789 I get a "terminate called after throwing an instance of 'Rcpp::LongjumpException'" crash. I don't know if all problems are fixed, but hopefully we still have some time before the next release to test this further.

@codecov-io

codecov-io commented Jun 1, 2018

Codecov Report

Merging #859 into master will decrease coverage by 0.02%.
The diff coverage is 9.09%.


@@            Coverage Diff             @@
##           master     #859      +/-   ##
==========================================
- Coverage   90.49%   90.46%   -0.03%     
==========================================
  Files          70       70              
  Lines        3240     3251      +11     
==========================================
+ Hits         2932     2941       +9     
- Misses        308      310       +2
| Impacted Files | Coverage Δ |
|---|---|
| src/attributes.cpp | 98.5% <ø> (+0.08%) ⬆️ |
| inst/include/RcppCommon.h | 100% <ø> (ø) ⬆️ |
| inst/include/Rcpp/api/meat/Rcpp_eval.h | 63.15% <0%> (ø) ⬆️ |
| inst/include/Rcpp/exceptions.h | 7.69% <0%> (-42.31%) ⬇️ |
| inst/include/Rcpp/Environment.h | 100% <100%> (ø) ⬆️ |
| inst/include/Rcpp/Named.h | 100% <0%> (+18.18%) ⬆️ |
| inst/include/Rcpp/grow.h | 100% <0%> (+33.33%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6886a43...a8c5463.

@eddelbuettel
Member

Hi @lionel- -- thanks for working on this. Am a little tied up with R/Finance as well as the R 3.5.0 binary transition in Debian so it may take me a day or two or three to get to this. But much appreciated. If you have better / more ideas for additional test beds / platforms I'd be all ears.

@lionel-
Contributor Author

lionel- commented Jun 2, 2018

No worries. It'd be helpful to have a few additional Travis targets with old R versions, ideally one for each major release supported by Rcpp. Windows CI with Appveyor would be great as well.

@eddelbuettel
Member

Could script something using Rocker and older r-base builds? I am more worried about not-so-sane platforms ie the one starting with win and ending with dows, or our old friend Slowlaris, or ...

@lionel-
Contributor Author

lionel- commented Jun 2, 2018

Could script something using Rocker and older r-base builds?

That'd be great. I forgot to check on old R versions and I just encountered a crash with R CMD check on R 3.4. This seems to be the same sourceCpp() crash as on win-builder with current master so is likely unrelated to protect-unwind. The good thing is that unlike on win-builder we got a traceback!

Edit: R CMD check passes on R 3.4 when the embeddedR test is disabled.

Executing test function test.embeddedR  ... 
> x <- foo()

> x
[1] 42

> x <- foo()

 *** caught segfault ***
address 0x382d465455, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "InternalFunction_invoke", address = <pointer: 0x7f92a6c40a60>,     dll = list(name = "Rcpp", path = "/Users/lionel/Dropbox/Projects/R/misc/Rcpp.Rcheck/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x7f92a6f4c940>,         info = <pointer: 0x7f92a706de40>), numParameters = -1L),     <pointer: 0x7f92a6d1cff0>, ...)
 2: foo()
 3: eval(ei, envir)
 4: eval(ei, envir)
 5: withVisible(eval(ei, envir))
 6: source(file = srcConn, local = env, echo = TRUE)
 7: Rcpp::sourceCpp(file.path(path, "embeddedR2.cpp"), env = newEnv2)
 8: eval(expr, envir = parent.frame())
 9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
13: try(eval(expr, envir = parent.frame()), silent = silent)
14: inherits(try(eval(expr, envir = parent.frame()), silent = silent),     "try-error")
15: checkException(Rcpp::sourceCpp(file.path(path, "embeddedR2.cpp"),     env = newEnv2), " not available in other env")
16: func()
17: system.time(func())
18: doTryCatch(return(expr), name, parentenv, handler)
19: tryCatchOne(expr, names, parentenv, handlers[[1L]])
20: tryCatchList(expr, classes, parentenv, handlers)
21: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
22: try(system.time(func()))
23: .executeTestCase(funcName, envir = sandbox, setUpFunc = .setUp,     tearDownFunc = .tearDown)
24: .sourceTestFile(testFile, testSuite$testFuncRegexp)
25: runTestSuite(testSuite)
An irrecoverable exception occurred. R is aborting now ...

namespace Rcpp {
namespace internal {

#ifdef RCPP_USE_PROTECT_UNWIND

// Store the jump buffer as a static variable in function scope
// because inline variables are a C++17 extension.
inline std::jmp_buf* get_jmpbuf_ptr() {
Contributor

Is it okay for this to be an inline static? It seems like things could go wrong if there were recursive calls to Rcpp_eval(), e.g. if you were to Rcpp_eval() an expression that itself called R functions that call Rcpp_eval().

I wonder if instead this could be part of the EvalData struct or similar?

Contributor Author

doh, good point! I will write a failing test and fix this, the jumpbuf should indeed be on the stack and we'd pass a pointer as data to the cleanup callback.
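
To illustrate why the buffer belongs on the stack, here is a rough sketch of nested protected calls in which every call owns its `jmp_buf` and hands a pointer to the cleanup callback. It is not the code in this PR: `mock_protect()`, `protected_eval()` and `JumpException` are invented names, and a negative status is an assumed way of modelling a pending jump. With a single shared static buffer, the inner `setjmp` would overwrite the outer target, so a later outer jump would aim at a frame that has already returned.

```cpp
#include <cassert>
#include <csetjmp>
#include <stdexcept>

// Hypothetical exception type thrown once control is back in a C++ frame.
struct JumpException : std::runtime_error {
    JumpException() : std::runtime_error("jump") {}
};

using Body = int (*)(int depth);

// Cleanup callback: the jump target arrives through `data`, so each
// protected call jumps back to its own frame.
void maybe_jump(void* data, int jump) {
    if (jump) std::longjmp(*static_cast<std::jmp_buf*>(data), 1);
}

// Mock C-level protect function (NOT the real R API); a negative status
// from `body` models a pending non-local exit.
int mock_protect(Body body, int depth,
                 void (*cleanup)(void*, int), void* cleanup_data) {
    int status = body(depth);
    cleanup(cleanup_data, status < 0);
    return status;
}

int protected_eval(Body body, int depth) {
    std::jmp_buf jmpbuf;  // stack-local: one buffer per (possibly nested) call
    if (setjmp(jmpbuf)) {
        throw JumpException();
    }
    return mock_protect(body, depth, maybe_jump, &jmpbuf);
}

int g_caught = 0;  // number of jumps that surfaced as C++ exceptions

// Re-enters protected_eval() before requesting a jump of its own, which
// mimics an evaluated expression that calls back into Rcpp_eval().
int nested_body(int depth) {
    if (depth > 0) {
        try {
            protected_eval(nested_body, depth - 1);
        } catch (const JumpException&) {
            ++g_caught;
        }
    }
    return -1;  // every level requests its own jump afterwards
}

int count_caught(int depth) {
    g_caught = 0;
    try {
        protected_eval(nested_body, depth);
    } catch (const JumpException&) {
        ++g_caught;
    }
    return g_caught;
}
```

Each nesting level produces exactly one exception, caught one level up, so `count_caught(n)` yields `n + 1` in this sketch.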

Member

Hi @lionel- -- do you plan to add that test to this PR?

Contributor Author

Yes pretty soon.

Member

Awesome, no rush. Was just wondering as there was such a rush here, and then a slowdown.

struct EvalData {
    SEXP expr;
    SEXP env;
    EvalData(SEXP expr_, SEXP env_) : expr(expr_), env(env_) { }
};

// First jump back to the protected context with a C longjmp because
Contributor

Nice! Did you already test this on 32bit Windows? I bet this fixes the crash there (cc: @jeroen, since we were looking at this before and it looks like @lionel- has a nice solution)

Contributor Author

Are the i386 checks performed by R CMD check sufficient? Or should I set up another virtual box with a 32 bit Windows to check that architecture properly?

Contributor

Good point, I think that will be sufficient. I also validated this works as expected on my own VM. Nice!

One other thought: do we need to worry about this interfering with other ways that R might jump (e.g. interrupts, browser)? I think we discussed this already and we're pretty sure that we'll do the right thing as long as R_ContinueUnwind() is called but curious if you've already thought through these scenarios.

Contributor Author

@lionel- lionel- Jun 5, 2018

do we need to worry about this interfering with other ways that R might jump (e.g. interrupts, browser)?

All sorts of longjumps are covered by protect-unwind.

I've been wondering about longjumps occurring in destructors during a LongjumpException unwind. According to R semantics the last longjump should win but in C++ it's not possible to rethrow from a destructor. I think if C++11's current_exception() always returned a reference to the current exception we could update the token but unfortunately this is implementation-dependent and the exception_ptr might point to a copy.

So I think code in destructors that might cause longjumps should be unwind-protected and guarded with catch(...), and the new longjumps should simply be discarded. The original token contains all the information needed to accomplish the first longjump safely. Too bad that exception handling in C++ is so wonky and we can't do better.
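
A small sketch of that guard, under simplifying assumptions: the second longjump is modelled directly as a C++ exception thrown by a hypothetical `risky_cleanup()`, whereas in Rcpp the destructor body would also need to be unwind-protected so that a longjump surfaces as an exception in the first place. `JumpException` and `Resource` are illustrative names only.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception standing in for a longjump turned into a C++ throw.
struct JumpException : std::runtime_error {
    explicit JumpException(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical cleanup routine that may itself request a jump.
void risky_cleanup(bool should_jump) {
    if (should_jump) throw JumpException("second");
}

struct Resource {
    bool jump_in_dtor;
    int* discarded;  // counts jumps swallowed by the destructor

    Resource(bool jump, int* counter) : jump_in_dtor(jump), discarded(counter) {}

    ~Resource() {
        try {
            risky_cleanup(jump_in_dtor);
        } catch (...) {
            // A new jump raised while unwinding is dropped; the exception
            // already in flight keeps propagating.
            ++*discarded;
        }
    }
};

// Returns the message of the exception that finally reaches the caller.
std::string surviving_jump(int* discarded) {
    try {
        Resource resource(true, discarded);
        throw JumpException("first");
    } catch (const JumpException& e) {
        return e.what();
    }
    return "none";
}
```

Because nothing escapes the destructor, `std::terminate` is never triggered, and the caller observes the first jump only.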

Contributor Author

unfortunately this is implementation-dependent and the exception_ptr might point to a copy.

oh hmm, even if the exception is copied it will still point to the same token SEXP, so we could just memcopy all the RAWSXP data from one token into the other. But then the structure of the token should become part of the API.

It may not be worth worrying about this too much as it's unlikely that complex R code will be run from destructors?

Contributor

I think most C++ destructors will not (or at least, should not) contain lots of R code. I imagine most destructors out there that do interact with R from a destructor involve R object lifetime / resource management (e.g. calling R_ReleaseObject(); or maybe cleaning up external pointers or something similar)

I think we don't need to worry about this too much since at least even with this PR we still have a substantial improvement over the status quo; anyone who does run into a problem of this form could likely work around it.

I think we can punt on this issue until we hear of a reported problem in the wild.

Contributor Author

That makes good sense, thanks! We used to have complex R code in a dplyr dtor but that is no longer the case IIRC.

namespace internal {

inline SEXP longjumpSentinel(SEXP token) {
SEXP sentinel = PROTECT(Rf_allocVector(VECSXP, 1));
Contributor

Just curious why we're using a length-one list rather than a length-one character vector directly?

Contributor Author

The sentinel wraps the longjump token that is passed to R_ContinueUnwind().

Contributor

Right! Sorry for the silly question.

@eddelbuettel
Member

No issues whatsoever on a full rev.dep check on Linux.

Do we need to test this more on other platforms? Any views, @kevinushey @lionel- ?

@lionel-
Contributor Author

lionel- commented Jun 5, 2018

Do we need to test this more on other platforms?

I think we should also test with RCPP_PROTECTED_EVAL enabled, once the fix for nested protected eval is pushed.

@lionel-
Contributor Author

lionel- commented Jun 5, 2018

Last commit implements stack-local jump buffers.

@eddelbuettel Could you run the revdeps again with #define RCPP_PROTECTED_EVAL at the top of Rcpp.h please?

@lionel-
Contributor Author

lionel- commented Jun 5, 2018

@eddelbuettel do your revdeps run on 3.4 or 3.5? I noticed today that your Travis still runs on 3.4.

@eddelbuettel
Member

@lionel- For the revdeps I still use 3.4.4. It is a Debian box and we are in the middle of the transition; it mostly uses local CRAN packages but I keep it "stateful" (ie not what your revdepper in devtools does, as I don't have all the time in the world to wait for the depends of 1360+ packages to get reinstalled each time) and for that I need a rebuild of "all". Haven't had time yet. But will do -- both the new rev.dep, and "eventually" the switch to 3.5.0

That said, you could play now with Rocker images. We have a bunch of 3.* images of base. As Rcpp has no depends itself it should be quick-ish to set up.

@lionel-
Contributor Author

lionel- commented Jun 5, 2018

That said, you could play now with Rocker images. We have a bunch of 3.* images of base. As Rcpp has no depends itself it should be quick-ish to set up.

Will all 1400 revdeps build cleanly with these images? Just asking out of curiosity, I think this can wait until your box is updated.

@eddelbuettel
Member

@lionel- That may have been a misunderstanding.

For rev.deps, I do "full one-level horizontal mode", ie all direct import/depends/linkingto. Not recursive. Large-ish skip list for unbuildable or otherwise stooopid packages (maybe 60, list is public).

It may help to check "vertical" against other R versions, but I think it would be wholly sufficient to just do a full R CMD check Rcpp_0.12.17.$X.orig.tar.gz for one each of R 3.4.*, 3.3.*, 3.2.*, ...

@kevinushey
Contributor

LGTM with the most recent commit!

@lionel-
Contributor Author

lionel- commented Jun 6, 2018

@eddelbuettel I have now run R CMD check on old R versions back to 3.1. Apart from randomly running into the same problems as on win-builder and r-hub (really long vignette building time, when I can build the vignettes at all, or the embeddedR crash), they all pass except the 3.1 build, whose tests assume the existence of the newer trimws() function:

1 Test Suite : 
Rcpp Unit Tests - 600 test functions, 3 errors, 0 failures
ERROR in test.sugar.mtrimws: Error in mode(current) : could not find function "trimws"
ERROR in test.sugar.strimws: Error in match.fun(FUN) : object 'trimws' not found
ERROR in test.sugar.vtrimws: Error in mode(current) : could not find function "trimws"

@eddelbuettel
Member

@lionel- That is awesome, thanks for doing that. I'd call that a thumbs-up then. Weird about the long vignette time though.

@eddelbuettel
Member

No new issues in rev.dep with the change below (on top of the patch) so this is going in now.

edd@debbuilder:~/git/rcpp(pr/859)$ git diff inst/include/Rcpp.h
diff --git a/inst/include/Rcpp.h b/inst/include/Rcpp.h
index 0d2b9dbc..4951ac8c 100644
--- a/inst/include/Rcpp.h
+++ b/inst/include/Rcpp.h
@@ -23,6 +23,9 @@
 #ifndef Rcpp_hpp
 #define Rcpp_hpp

+// enable new feature
+#define RCPP_PROTECTED_EVAL
+
 /* it is important that this comes first */
 #include <RcppCommon.h>

edd@debbuilder:~/git/rcpp(pr/859)$

@eddelbuettel eddelbuettel merged commit d1674cf into RcppCore:master Jun 7, 2018
@lionel- lionel- deleted the restore-unwind branch June 9, 2018 16:43