New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use protect-unwind API and add Rcpp_fast_eval() #789

Merged
merged 5 commits into from Dec 16, 2017

Conversation

Projects
None yet
4 participants
@lionel-
Contributor

lionel- commented Dec 14, 2017

  • Use protected evaluation when compiling Rcpp against R 3.5 (see wch/r-source@c8cc608). This prevents leaks when R code jumps (caught condition, restart invokation, interrupt, return, error).

  • Add Rcpp_fast_eval(). Unlike Rcpp_eval(), it doesn't wrap the code in tryEval() to avoid the catching overhead. It falls back to Rcpp_eval() when compiled against old versions of R so packages can use it without worrying about backward compatibility.

    Here are the differences between the new function and Rcpp_eval():

    • Better performance.
    • R errors are not rethrown on the C++ stack as Rcpp::condition. Instead they are rethrown as Rcpp::internal::LongjumpException as with all other longjump causes.
    • It has the control flow semantics of the C function Rf_eval() rather than the R function base::eval(). E.g. you can return from a frame environment on the R stack.

These changes should be fully backward compatible. I have tested this branch on R oldrel, release and devel here: https://travis-ci.org/lionel-/Rcpp/builds/316466207 (without vignettes and with the binary package test disabled). I'm not sure how to adapt the current Travis config to use R devel.

I have checked this branch with dplyr (which calls into R a lot) on both R release and R devel. I have not run the reverse dependencies checks. Would it be possible to run those on your server Dirk?

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 14, 2017

Would it be possible to run those on your server Dirk?

Sure! But that is only using r-release. So we'd know of regressions. No more, no less.

Edit: Now turned on.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 14, 2017

At first it may be best to condition the use of the protect-unwind API with an explicit #define.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 14, 2017

Which is not in the PR, correct? So why rev.dep check runs with it on unconditionally?

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 14, 2017

Your revdeps should opt out automatically of the new API because they are compiled against R release. I am wondering whether it's best to also opt out of it by default on R 3.5 until R_UnwindProtect() gets more widely tested. Maybe that's not necessary if the next release of Rcpp comes after 3.5 is out and the revdep checks with protected evaluation enabled turn out fine.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 14, 2017

We tend(ed) to release Rcpp every two months. By that timeline it will be January and March before R 3.5.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 14, 2017

Ok, so it seems best to condition so it is not automatically enabled when R 3.5 comes out. Does #define RCPP_PROTECTED_UNWIND sound good? Cf last commit.

Protect token while unwinding the stack
Because R code might run in C++ destructors
@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 15, 2017

Reverse depends concluded, I just committed the log and summary.

No compilation errors. A few new errors in testing from testthat which may be due to changes in that package. I will try to find some time test those against stock Rcpp later.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 15, 2017

We're good. Those ten new ones were all spurious (two missing depends, one phantom gone under re-run, seven apparently due to testthat as they also fail under previous Rcpp tests).

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 15, 2017

Great. Do you think we could get this merged for the next Rcpp release? Having it on CRAN will make it easier to start testing R_ProtectUnwind() in our packages by enabling RCPP_PROTECT_UNWIND on the R devel Travis builds.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 15, 2017

I think I'll just merge it as it, and we build from here. I trust the subsequent commits you made; I made let it run here once more with it.

Looks like a job well done (though we'll only know for sure with R 3.5.*). And you of course get extra brownie points for embracing the Emacs-ish ChangeLog use. At long last someone does :)

@kevinushey

This comment has been minimized.

Contributor

kevinushey commented Dec 15, 2017

Maybe not for this PR, but what do you think about making Rcpp_eval() and Rcpp_fast_eval() behave the same; ie, letting Rcpp_eval() use the new implementation (for R >= 3.5.0)?

Our Rcpp_eval() implementation basically invokes something of this form at the R level:

tryCatch(evalq(<expr>, <envir>), warning = identity, error = identity, interrupt = identity)

The intention of this code is to ensure that R longjmps are 'blocked' at this scope, so that longjmps never go 'up' the C++ stack. How does having the tryCatch() handlers active interact with the longjmp unwind protection here?

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 15, 2017

what do you think about making Rcpp_eval() and Rcpp_fast_eval() behave the same

One potential problem is that Rcpp::condition is not internal so there's probably existing code that tries to handle it at C++ level. We could assess the feasibility of this by changing the type of the thrown condition to Rcpp::internal::condition in a branch and running the revdeps on it.

How does having the tryCatch() handlers active interact with the longjmp unwind protection here?

There is no interaction for errors and interrupts because the tryCatch prevents the R code from jumping in these cases. The stack is still unwound with the existing mechanism (even if the error was a normally returned condition and not the result of catching an actual error/interrupt jump). All other longjump causes are detected and a Rcpp::internal::LongjumpException is thrown.

@kevinushey

This comment has been minimized.

Contributor

kevinushey commented Dec 15, 2017

One potential problem is that Rcpp::condition is not internal so there's probably existing code that tries to handle it at C++ level. We could assess the feasibility of this by changing the type of the thrown condition to Rcpp::internal::condition in a branch and running the revdeps on it.

Fair enough. We might want to give that a shot later (in a separate PR).

There is no interaction for errors and interrupts because the tryCatch prevents the R code from jumping in these cases. The stack is still unwound with the existing mechanism (even if the error was a normally returned condition and not the result of catching an actual error/interrupt jump). All other longjump causes are detected and a Rcpp::internal::LongjumpException is thrown.

Okay, awesome! Thanks for clarifying.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 15, 2017

By the way, if I'm not mistaken we could replace the evalq() with the quote() primitive and evaluate the tryCatch in the user-supplied environment (with the tryCatch() closure inlined in the expression in case the environment does not inherit from base or masks base::tryCatch()).

This should improve performance a bit and give Rcpp_eval() the semantics of Rf_eval() (which matters for stack-sensitive expressions like return(), sys.xxx(), etc). Code that assumes a specific call stack trace would have to be adapted though.

Edit: Facilitating long returns in this way would cause new kinds of leaks until the safe API is enabled.

@kevinushey

This comment has been minimized.

Contributor

kevinushey commented Dec 15, 2017

By the way, if I'm not mistaken we could replace the evalq() with the quote() primitive and evaluate the tryCatch in the user-supplied environment (with the tryCatch() closure inlined in the expression in case the environment does not inherit from base or masks base::tryCatch()).

This should improve performance a bit and give Rcpp_eval() the semantics of Rf_eval() (which matters for stack-sensitive expressions like return(), sys.xxx(), etc). Code that assumes a specific call stack trace would have to be adapted though.

Unfortunately, I'm fairly sure this is code of that form out there already in R packages so we might not be able to make that change, unless we had a convincing example of commonly-written R/Rcpp code that does the wrong thing in the current scheme.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 15, 2017

Ok, so it seems that merging Rcpp_fast_eval() and Rcpp_eval() is not feasible as it would have the same problem by cleaning the call stack trace even further.

@kevinushey

This comment has been minimized.

Contributor

kevinushey commented Dec 15, 2017

Fair enough. Perhaps we can just document for new users that Rcpp_fast_eval() is preferred for R packages targeting R >= 3.5.0.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Dec 15, 2017

Yup, probably best to wait until R 3.5 is out and the protected API enabled by default in Rcpp.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Dec 15, 2017

(Mostly just nodding along...)

@eddelbuettel eddelbuettel merged commit b9a73ba into RcppCore:master Dec 16, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 14, 2018

Hi @lionel quick question: was some of the material in the PR activated for stack unwinding?

With r-devel changing so much I had not pushed to win-builder in some time, and this evening I just got an error ending on

Executing test function test.stackUnwinds  ... 
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
terminate called after throwing an instance of 'Rcpp::internal::LongjumpException'

on 32 bit mode on r-devel. Full link is here for the usual time that win-builder keeps it. Could you take a look and see if it may be related?

@lionel-

This comment has been minimized.

Contributor

lionel- commented Jan 14, 2018

was some of the material in the PR activated for stack unwinding?

Only in the test.stackUnwinds unit test. From the message it seems like the LongjumpException is not caught by the wrapping macros. I just tried on R-hub (windows-x86_64-devel) and it seems to work properly (some unit tests fail but not due to exception catching) https://artifacts.r-hub.io/Rcpp_0.12.14.5.tar.gz-ca8f8d1eaf454fdea479be9b68bbeec4/Rcpp.Rcheck/00check.log

I'm not sure how the platform can impact the wrapping macros or exception catching. I am on vacations for one week and then I'll be busy preparing a webinar, so I won't have time to look into this further until January 25th.

One easy way around this is to disable the test for now. It'd also be better to extract the test in its own file so the protection stuff is not enabled for the other tests in runit.misc.R / misc.cpp. I'll send a PR to that effect.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 14, 2018

Ok, noted on the timing--bad luck, and my fault for not trying windows r-devel earlier, despite the current changes.

As it is about release time for Rcpp, I will indeed just condition out the test. As it works on Linux, I will only run it there

Don't send a PR just to isolate one test into another, that's not worth it. Enjoy the vacation... Once we work on re-activating this code with R 3.5.0 we can move tests into their file.

PS I had done a bisection and also uploaded Rcpp from just before your patches, which passes.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 14, 2018

Turns out there may be more trouble.

Your test is inactive, but I still have trouble with windows tests on either i386 or x86_64. Very, very weird. Still in the middle of it.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

I have the feeling we had seen 32 and 64 bit failures. But maybe not -- all I triggered was via joint build on, respectively, win-builder or r-hub.

@krlmlr

This comment has been minimized.

Contributor

krlmlr commented Jan 15, 2018

I have sent a few instances of b52f690 to win-builder, will post results.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

I am currently doing that too. Sucketh big time because each iteration is 30+ minutes.

The worst is that I have two complete sets of what is now HEAD which failed once and passed once. The same tarball ...

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

More current thread also over there at #801.

@krlmlr

This comment has been minimized.

Contributor

krlmlr commented Jan 15, 2018

Not if you send with different versions ;-) And devtools::build_win() is a big time-saver for me.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

As I said over there, the same tarball got both a fail and a pass at two different points in time.

What does build_win() do? I have a two-line shell script (well, plus arg checking and help) that uses ncftp to copy a tarball in batch to both instances at win-builder.

@krlmlr

This comment has been minimized.

Contributor

krlmlr commented Jan 15, 2018

build_win() probably does the same as your shell script.

I've sent five tarballs that are identical apart from the version number. Enough for a majority vote ;-)

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

Good experimental design :)

@kevinushey

This comment has been minimized.

Contributor

kevinushey commented Jan 15, 2018

I e-mailed Luke and his response was effectively "ask @lionel-" ... so color me frustrated as well.

I think we should remove the use of R_UnwindProtect() altogether for this release until we hear back from @lionel- when he gets back from vacation.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

I am about done editing this locally. Let me test it once or twice here, then push the branch.

I basically left all he added, but #ifdef-ed it away. This should work. Unless I tripped myself up.

eddelbuettel added a commit that referenced this pull request Jan 15, 2018

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

Ok, new branch is here and I would definitely appreciate another few sets of eyes.

"Works here", submitted to win-builder and r-hub.

@lionel-

This comment has been minimized.

Contributor

lionel- commented Jan 15, 2018

Works fine on macOS, works fine on 64bit R on Windows, crashes on 32bit R on Windows as the exception is not caught. @lionel- does this seem like a minimal enough example that we might want to let some members of R Core know?

Interesting. Maybe we should look at this together at the work week?

In any case this doesn't explain the trouble caused by this branch because I'm pretty sure that R_UnwindProtect() was not called at all (after the tests were conditioned out).

Thanks for all your work sorting this out @eddelbuettel.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 15, 2018

No worries. Your patch meant to do the right thing, and we'll get there. I just had to pay extra today and yesterday because I don't usually run CI on Windows, and had not submitted "occassionally" as I usually do as R-devel changed to much of late. Lesson learned.

eddelbuettel added a commit that referenced this pull request Jan 16, 2018

Merge pull request #802 from RcppCore/feature/condition_unwind_protect
shielding #789 behind a new #define to effectively disable it now
@lionel-

This comment has been minimized.

Contributor

lionel- commented Jan 16, 2018

FWIW I have sent Rcpp 0.12.14 to win-builder and got a timeout while rebuilding the vignettes: https://win-builder.r-project.org/112Ax81s7Dib/00check.log

Also R-hub times out in the tests with that same version. This seems to rule out this PR as the cause of undeterministic failures.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 16, 2018

No, that is precisely the issue. One of the vignettes runs all the unit tests and the fact that it doesn't come back absolutely is related to this issue. If you feel like it, try 0.12.13, or try what it is now in master and waiting at CRAN as 0.12.15.

R-hub is likely a different issue indeed.

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 16, 2018

If you care, this is what I got this morning before submitting this to CRAN as 0.12.15:

r-devel

From: Uwe.Ligges@R-Project.org
To: edd@debian.org
Cc: Uwe.Ligges@R-Project.org
Subject: winbuilder: Package Rcpp_0.12.15.tar.gz has been checked and built
Date: Tue, 16 Jan 2018 13:35:30 +0100

Dear package maintainer,
 
this notification has been generated automatically.
Your package Rcpp_0.12.15.tar.gz has been built (if working) and checked for Windows.
Please check the log files and (if working) the binary package at:
https://win-builder.r-project.org/w4r8LSs3H6eY
The files will be removed after roughly 72 hours.
Installation time in seconds: 89
Check time in seconds: 717
Status: 4 NOTEs
R Under development (unstable) (2018-01-15 r74124)
 
All the best,
Uwe Ligges
(CRAN maintainer of binary packages for Windows)

r-release


From: Uwe.Ligges@R-Project.org
To: edd@debian.org
Cc: Uwe.Ligges@R-Project.org
Subject: winbuilder: Package Rcpp_0.12.15.tar.gz has been checked and built
Date: Tue, 16 Jan 2018 13:37:07 +0100

Dear package maintainer,
 
this notification has been generated automatically.
Your package Rcpp_0.12.15.tar.gz has been built (if working) and checked for Windows.
Please check the log files and (if working) the binary package at:
https://win-builder.r-project.org/zGdIH49Vy33X
The files will be removed after roughly 72 hours.
Installation time in seconds: 85
Check time in seconds: 720
Status: 5 NOTEs
R version 3.4.3 (2017-11-30)
 
All the best,
Uwe Ligges
(CRAN maintainer of binary packages for Windows)
@lionel-

This comment has been minimized.

Contributor

lionel- commented Jan 16, 2018

One of the vignettes runs all the unit tests and the fact that it doesn't come back absolutely is related to this issue.

But the version I sent was anterior to this PR. Maybe the 4th version component causes too many tests to be run while building the vignette?

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 16, 2018

Here are my logs from yesterday. Note that r-hub essentially always breaks. The key is that

  • before your PR winbuilder passes on both
  • after your PR r-rel passes, r-dev fails
  • this was a bit a dump of conciousness, I did not run all in order or write entries in order
  • we did conclude that win-builder also had some flakyness (but cf @krlmlr getting "five for five")
  • we all concluded that the PR was not good at this time (cf @kevinushey comments)
- 0.12.14.1 just after 0.12.14
    - now in ~/git/rcpp-is-fucked-up/000_release_0.12.14/
    - sent to win-builder @ 10:32
    - 40-ish mins later PASS on r-devel and r-release
        
- 0.12.14.3 was pivoted just before Lionel's pr789
    - now checked out in ~/git/rcpp-is-fucked-up/010_before_789_after_785/Rcpp
    - PASSed win builder on 01/14
    - times out at r-hub, indicating r-hub may not be quite right

- 0.12.14.3.1 (and 0.12.14.35) was just after Lionel's pr789 
    - branch 'local/before790' at 1630a55b
    - added skip of Lionel's test
    - http://builder.r-hub.io/status/Rcpp_0.12.14.3.1.tar.gz-c3812d0bcbcd430c8f691c6abb3a3191
    - times out at rhub
    - built again at 1630a55b in ~/git/rcpp-is-fucked-up/020_after_789_before_790
    - calling this 0.12.14.35
    - sent to win-builder @ 11:28
    - 30+ mins later PASS r-release
    - 60+ mins later FAIL r-devel (note that we had NOT remove his tests)

- 0.12.14.4 was just after #790 by Kevin
    - added skip of Lionel's test
    - http://builder.r-hub.io/status/Rcpp_0.12.14.4.tar.gz-614f7dd1ed444823bb5aedf5404d44e9
    - times out at rhub 
    
- 0.12.14.7 was another more minimalistic attempt "from the current end" without unwinw
    - times out at still at rhub
        
- 0.12.14.7.1 was `git checkout -b local/beforePR797`
    - times out at rhub
      
- 0.12.14.8 ("current" as of 01/14 and 01/15)
    - failed win-builder (r-rel,r-dev) on 01/14
    - PASSed win-builder (r-rel,r-dev) on 01/15

- 0.12.14.8 (with new #ifdef)
    - sent to win-builder at 14:40 or so
    - on r-hub three WARNING (normal) and one ERROR (with UBSAN, no actual error, also normal)
@lionel-

This comment has been minimized.

Contributor

lionel- commented Jan 16, 2018

we all concluded that the PR was not good at this time

Agreed. But we still need to figure out whether the PR really is the cause of the nondeterministic failures. Aren't you concerned that the failure also occurs with the previous CRAN release?

@eddelbuettel

This comment has been minimized.

Member

eddelbuettel commented Jan 16, 2018

Aren't you concerned that the failure also occurs with the previous CRAN release?

Well, it doesn't. First entry:

- 0.12.14.1 just after 0.12.14
    - now in ~/git/rcpp-is-fucked-up/000_release_0.12.14/
    - sent to win-builder @ 10:32
    - 40-ish mins later PASS on r-devel and r-release

PASS on r-devel and r-release is where we started, with a snapshot just after 0.12.14.

I do not think that what you and I are doing right now is productive use of my time, or your vacation. We all had determined that the issue was with #789. You can fix it, or throw your arms in the air. But please do not make me re-visit every aspect now. We all missed that this comes up a lot on Windows, but also see @kevinushey above (or maybe in the #801 thread) stating that it also took his RStudio on OS X down though not each time.

Maybe we should just wait til this stabilizes in R-devel.

lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018

lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018

lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018

lionel- added a commit to lionel-/Rcpp that referenced this pull request Jun 1, 2018

@lionel- lionel- referenced this pull request Jun 1, 2018

Merged

Fix protect-unwind #859

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment