Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use protect-unwind API and add Rcpp_fast_eval() #789

Merged
merged 5 commits into from
Dec 16, 2017

Conversation

lionel-
Copy link
Contributor

@lionel- lionel- commented Dec 14, 2017

  • Use protected evaluation when compiling Rcpp against R 3.5 (see wch/r-source@c8cc608). This prevents leaks when R code jumps (caught condition, restart invokation, interrupt, return, error).

  • Add Rcpp_fast_eval(). Unlike Rcpp_eval(), it doesn't wrap the code in tryEval() to avoid the catching overhead. It falls back to Rcpp_eval() when compiled against old versions of R so packages can use it without worrying about backward compatibility.

    Here are the differences between the new function and Rcpp_eval():

    • Better performance.
    • R errors are not rethrown on the C++ stack as Rcpp::condition. Instead they are rethrown as Rcpp::internal::LongjumpException as with all other longjump causes.
    • It has the control flow semantics of the C function Rf_eval() rather than the R function base::eval(). E.g. you can return from a frame environment on the R stack.

These changes should be fully backward compatible. I have tested this branch on R oldrel, release and devel here: https://travis-ci.org/lionel-/Rcpp/builds/316466207 (without vignettes and with the binary package test disabled). I'm not sure how to adapt the current Travis config to use R devel.

I have checked this branch with dplyr (which calls into R a lot) on both R release and R devel. I have not run the reverse dependencies checks. Would it be possible to run those on your server Dirk?

@eddelbuettel
Copy link
Member

eddelbuettel commented Dec 14, 2017

Would it be possible to run those on your server Dirk?

Sure! But that is only using r-release. So we'd know of regressions. No more, no less.

Edit: Now turned on.

@lionel-
Copy link
Contributor Author

lionel- commented Dec 14, 2017

At first it may be best to condition the use of the protect-unwind API with an explicit #define.

@eddelbuettel
Copy link
Member

Which is not in the PR, correct? So why rev.dep check runs with it on unconditionally?

@lionel-
Copy link
Contributor Author

lionel- commented Dec 14, 2017

Your revdeps should opt out automatically of the new API because they are compiled against R release. I am wondering whether it's best to also opt out of it by default on R 3.5 until R_UnwindProtect() gets more widely tested. Maybe that's not necessary if the next release of Rcpp comes after 3.5 is out and the revdep checks with protected evaluation enabled turn out fine.

@eddelbuettel
Copy link
Member

We tend(ed) to release Rcpp every two months. By that timeline it will be January and March before R 3.5.

@lionel-
Copy link
Contributor Author

lionel- commented Dec 14, 2017

Ok, so it seems best to condition so it is not automatically enabled when R 3.5 comes out. Does #define RCPP_PROTECTED_UNWIND sound good? Cf last commit.

Because R code might run in C++ destructors
@eddelbuettel
Copy link
Member

eddelbuettel commented Dec 15, 2017

Reverse depends concluded, I just committed the log and summary.

No compilation errors. A few new errors in testing from testthat which may be due to changes in that package. I will try to find some time test those against stock Rcpp later.

@eddelbuettel
Copy link
Member

We're good. Those ten new ones were all spurious (two missing depends, one phantom gone under re-run, seven apparently due to testthat as they also fail under previous Rcpp tests).

@lionel-
Copy link
Contributor Author

lionel- commented Dec 15, 2017

Great. Do you think we could get this merged for the next Rcpp release? Having it on CRAN will make it easier to start testing R_ProtectUnwind() in our packages by enabling RCPP_PROTECT_UNWIND on the R devel Travis builds.

@eddelbuettel
Copy link
Member

I think I'll just merge it as it, and we build from here. I trust the subsequent commits you made; I made let it run here once more with it.

Looks like a job well done (though we'll only know for sure with R 3.5.*). And you of course get extra brownie points for embracing the Emacs-ish ChangeLog use. At long last someone does :)

@kevinushey
Copy link
Contributor

Maybe not for this PR, but what do you think about making Rcpp_eval() and Rcpp_fast_eval() behave the same; ie, letting Rcpp_eval() use the new implementation (for R >= 3.5.0)?

Our Rcpp_eval() implementation basically invokes something of this form at the R level:

tryCatch(evalq(<expr>, <envir>), warning = identity, error = identity, interrupt = identity)

The intention of this code is to ensure that R longjmps are 'blocked' at this scope, so that longjmps never go 'up' the C++ stack. How does having the tryCatch() handlers active interact with the longjmp unwind protection here?

@lionel-
Copy link
Contributor Author

lionel- commented Dec 15, 2017

what do you think about making Rcpp_eval() and Rcpp_fast_eval() behave the same

One potential problem is that Rcpp::condition is not internal so there's probably existing code that tries to handle it at C++ level. We could assess the feasibility of this by changing the type of the thrown condition to Rcpp::internal::condition in a branch and running the revdeps on it.

How does having the tryCatch() handlers active interact with the longjmp unwind protection here?

There is no interaction for errors and interrupts because the tryCatch prevents the R code from jumping in these cases. The stack is still unwound with the existing mechanism (even if the error was a normally returned condition and not the result of catching an actual error/interrupt jump). All other longjump causes are detected and a Rcpp::internal::LongjumpException is thrown.

@kevinushey
Copy link
Contributor

One potential problem is that Rcpp::condition is not internal so there's probably existing code that tries to handle it at C++ level. We could assess the feasibility of this by changing the type of the thrown condition to Rcpp::internal::condition in a branch and running the revdeps on it.

Fair enough. We might want to give that a shot later (in a separate PR).

There is no interaction for errors and interrupts because the tryCatch prevents the R code from jumping in these cases. The stack is still unwound with the existing mechanism (even if the error was a normally returned condition and not the result of catching an actual error/interrupt jump). All other longjump causes are detected and a Rcpp::internal::LongjumpException is thrown.

Okay, awesome! Thanks for clarifying.

@lionel-
Copy link
Contributor Author

lionel- commented Dec 15, 2017

By the way, if I'm not mistaken we could replace the evalq() with the quote() primitive and evaluate the tryCatch in the user-supplied environment (with the tryCatch() closure inlined in the expression in case the environment does not inherit from base or masks base::tryCatch()).

This should improve performance a bit and give Rcpp_eval() the semantics of Rf_eval() (which matters for stack-sensitive expressions like return(), sys.xxx(), etc). Code that assumes a specific call stack trace would have to be adapted though.

Edit: Facilitating long returns in this way would cause new kinds of leaks until the safe API is enabled.

@kevinushey
Copy link
Contributor

By the way, if I'm not mistaken we could replace the evalq() with the quote() primitive and evaluate the tryCatch in the user-supplied environment (with the tryCatch() closure inlined in the expression in case the environment does not inherit from base or masks base::tryCatch()).

This should improve performance a bit and give Rcpp_eval() the semantics of Rf_eval() (which matters for stack-sensitive expressions like return(), sys.xxx(), etc). Code that assumes a specific call stack trace would have to be adapted though.

Unfortunately, I'm fairly sure this is code of that form out there already in R packages so we might not be able to make that change, unless we had a convincing example of commonly-written R/Rcpp code that does the wrong thing in the current scheme.

@lionel-
Copy link
Contributor Author

lionel- commented Dec 15, 2017

Ok, so it seems that merging Rcpp_fast_eval() and Rcpp_eval() is not feasible as it would have the same problem by cleaning the call stack trace even further.

@kevinushey
Copy link
Contributor

Fair enough. Perhaps we can just document for new users that Rcpp_fast_eval() is preferred for R packages targeting R >= 3.5.0.

@lionel-
Copy link
Contributor Author

lionel- commented Dec 15, 2017

Yup, probably best to wait until R 3.5 is out and the protected API enabled by default in Rcpp.

@eddelbuettel
Copy link
Member

(Mostly just nodding along...)

@eddelbuettel eddelbuettel merged commit b9a73ba into RcppCore:master Dec 16, 2017
@eddelbuettel
Copy link
Member

eddelbuettel commented Jan 14, 2018

Hi @lionel quick question: was some of the material in the PR activated for stack unwinding?

With r-devel changing so much I had not pushed to win-builder in some time, and this evening I just got an error ending on

Executing test function test.stackUnwinds  ... 
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
terminate called after throwing an instance of 'Rcpp::internal::LongjumpException'

on 32 bit mode on r-devel. Full link is here for the usual time that win-builder keeps it. Could you take a look and see if it may be related?

@lionel-
Copy link
Contributor Author

lionel- commented Jan 14, 2018

was some of the material in the PR activated for stack unwinding?

Only in the test.stackUnwinds unit test. From the message it seems like the LongjumpException is not caught by the wrapping macros. I just tried on R-hub (windows-x86_64-devel) and it seems to work properly (some unit tests fail but not due to exception catching) https://artifacts.r-hub.io/Rcpp_0.12.14.5.tar.gz-ca8f8d1eaf454fdea479be9b68bbeec4/Rcpp.Rcheck/00check.log

I'm not sure how the platform can impact the wrapping macros or exception catching. I am on vacations for one week and then I'll be busy preparing a webinar, so I won't have time to look into this further until January 25th.

One easy way around this is to disable the test for now. It'd also be better to extract the test in its own file so the protection stuff is not enabled for the other tests in runit.misc.R / misc.cpp. I'll send a PR to that effect.

@eddelbuettel
Copy link
Member

eddelbuettel commented Jan 14, 2018

Ok, noted on the timing--bad luck, and my fault for not trying windows r-devel earlier, despite the current changes.

As it is about release time for Rcpp, I will indeed just condition out the test. As it works on Linux, I will only run it there

Don't send a PR just to isolate one test into another, that's not worth it. Enjoy the vacation... Once we work on re-activating this code with R 3.5.0 we can move tests into their file.

PS I had done a bisection and also uploaded Rcpp from just before your patches, which passes.

@eddelbuettel
Copy link
Member

Turns out there may be more trouble.

Your test is inactive, but I still have trouble with windows tests on either i386 or x86_64. Very, very weird. Still in the middle of it.

@eddelbuettel eddelbuettel mentioned this pull request Jan 14, 2018
@krlmlr
Copy link
Contributor

krlmlr commented Jan 15, 2018

I have sent a few instances of b52f690 to win-builder, will post results.

@eddelbuettel
Copy link
Member

I am currently doing that too. Sucketh big time because each iteration is 30+ minutes.

The worst is that I have two complete sets of what is now HEAD which failed once and passed once. The same tarball ...

@eddelbuettel
Copy link
Member

More current thread also over there at #801.

@krlmlr
Copy link
Contributor

krlmlr commented Jan 15, 2018

Not if you send with different versions ;-) And devtools::build_win() is a big time-saver for me.

@eddelbuettel
Copy link
Member

As I said over there, the same tarball got both a fail and a pass at two different points in time.

What does build_win() do? I have a two-line shell script (well, plus arg checking and help) that uses ncftp to copy a tarball in batch to both instances at win-builder.

@krlmlr
Copy link
Contributor

krlmlr commented Jan 15, 2018

build_win() probably does the same as your shell script.

I've sent five tarballs that are identical apart from the version number. Enough for a majority vote ;-)

@eddelbuettel
Copy link
Member

Good experimental design :)

@kevinushey
Copy link
Contributor

I e-mailed Luke and his response was effectively "ask @lionel-" ... so color me frustrated as well.

I think we should remove the use of R_UnwindProtect() altogether for this release until we hear back from @lionel- when he gets back from vacation.

@eddelbuettel
Copy link
Member

I am about done editing this locally. Let me test it once or twice here, then push the branch.

I basically left all he added, but #ifdef-ed it away. This should work. Unless I tripped myself up.

@eddelbuettel
Copy link
Member

Ok, new branch is here and I would definitely appreciate another few sets of eyes.

"Works here", submitted to win-builder and r-hub.

@lionel-
Copy link
Contributor Author

lionel- commented Jan 15, 2018

Works fine on macOS, works fine on 64bit R on Windows, crashes on 32bit R on Windows as the exception is not caught. @lionel- does this seem like a minimal enough example that we might want to let some members of R Core know?

Interesting. Maybe we should look at this together at the work week?

In any case this doesn't explain the trouble caused by this branch because I'm pretty sure that R_UnwindProtect() was not called at all (after the tests were conditioned out).

Thanks for all your work sorting this out @eddelbuettel.

@eddelbuettel
Copy link
Member

No worries. Your patch meant to do the right thing, and we'll get there. I just had to pay extra today and yesterday because I don't usually run CI on Windows, and had not submitted "occassionally" as I usually do as R-devel changed to much of late. Lesson learned.

eddelbuettel added a commit that referenced this pull request Jan 16, 2018
shielding #789 behind a new #define to effectively disable it now
@lionel-
Copy link
Contributor Author

lionel- commented Jan 16, 2018

FWIW I have sent Rcpp 0.12.14 to win-builder and got a timeout while rebuilding the vignettes: https://win-builder.r-project.org/112Ax81s7Dib/00check.log

Also R-hub times out in the tests with that same version. This seems to rule out this PR as the cause of undeterministic failures.

@eddelbuettel
Copy link
Member

eddelbuettel commented Jan 16, 2018

No, that is precisely the issue. One of the vignettes runs all the unit tests and the fact that it doesn't come back absolutely is related to this issue. If you feel like it, try 0.12.13, or try what it is now in master and waiting at CRAN as 0.12.15.

R-hub is likely a different issue indeed.

@eddelbuettel
Copy link
Member

If you care, this is what I got this morning before submitting this to CRAN as 0.12.15:

r-devel

From: Uwe.Ligges@R-Project.org
To: edd@debian.org
Cc: Uwe.Ligges@R-Project.org
Subject: winbuilder: Package Rcpp_0.12.15.tar.gz has been checked and built
Date: Tue, 16 Jan 2018 13:35:30 +0100

Dear package maintainer,
 
this notification has been generated automatically.
Your package Rcpp_0.12.15.tar.gz has been built (if working) and checked for Windows.
Please check the log files and (if working) the binary package at:
https://win-builder.r-project.org/w4r8LSs3H6eY
The files will be removed after roughly 72 hours.
Installation time in seconds: 89
Check time in seconds: 717
Status: 4 NOTEs
R Under development (unstable) (2018-01-15 r74124)
 
All the best,
Uwe Ligges
(CRAN maintainer of binary packages for Windows)

r-release


From: Uwe.Ligges@R-Project.org
To: edd@debian.org
Cc: Uwe.Ligges@R-Project.org
Subject: winbuilder: Package Rcpp_0.12.15.tar.gz has been checked and built
Date: Tue, 16 Jan 2018 13:37:07 +0100

Dear package maintainer,
 
this notification has been generated automatically.
Your package Rcpp_0.12.15.tar.gz has been built (if working) and checked for Windows.
Please check the log files and (if working) the binary package at:
https://win-builder.r-project.org/zGdIH49Vy33X
The files will be removed after roughly 72 hours.
Installation time in seconds: 85
Check time in seconds: 720
Status: 5 NOTEs
R version 3.4.3 (2017-11-30)
 
All the best,
Uwe Ligges
(CRAN maintainer of binary packages for Windows)

@lionel-
Copy link
Contributor Author

lionel- commented Jan 16, 2018

One of the vignettes runs all the unit tests and the fact that it doesn't come back absolutely is related to this issue.

But the version I sent was anterior to this PR. Maybe the 4th version component causes too many tests to be run while building the vignette?

@eddelbuettel
Copy link
Member

Here are my logs from yesterday. Note that r-hub essentially always breaks. The key is that

  • before your PR winbuilder passes on both
  • after your PR r-rel passes, r-dev fails
  • this was a bit a dump of conciousness, I did not run all in order or write entries in order
  • we did conclude that win-builder also had some flakyness (but cf @krlmlr getting "five for five")
  • we all concluded that the PR was not good at this time (cf @kevinushey comments)
- 0.12.14.1 just after 0.12.14
    - now in ~/git/rcpp-is-fucked-up/000_release_0.12.14/
    - sent to win-builder @ 10:32
    - 40-ish mins later PASS on r-devel and r-release
        
- 0.12.14.3 was pivoted just before Lionel's pr789
    - now checked out in ~/git/rcpp-is-fucked-up/010_before_789_after_785/Rcpp
    - PASSed win builder on 01/14
    - times out at r-hub, indicating r-hub may not be quite right

- 0.12.14.3.1 (and 0.12.14.35) was just after Lionel's pr789 
    - branch 'local/before790' at 1630a55b
    - added skip of Lionel's test
    - http://builder.r-hub.io/status/Rcpp_0.12.14.3.1.tar.gz-c3812d0bcbcd430c8f691c6abb3a3191
    - times out at rhub
    - built again at 1630a55b in ~/git/rcpp-is-fucked-up/020_after_789_before_790
    - calling this 0.12.14.35
    - sent to win-builder @ 11:28
    - 30+ mins later PASS r-release
    - 60+ mins later FAIL r-devel (note that we had NOT remove his tests)

- 0.12.14.4 was just after #790 by Kevin
    - added skip of Lionel's test
    - http://builder.r-hub.io/status/Rcpp_0.12.14.4.tar.gz-614f7dd1ed444823bb5aedf5404d44e9
    - times out at rhub 
    
- 0.12.14.7 was another more minimalistic attempt "from the current end" without unwinw
    - times out at still at rhub
        
- 0.12.14.7.1 was `git checkout -b local/beforePR797`
    - times out at rhub
      
- 0.12.14.8 ("current" as of 01/14 and 01/15)
    - failed win-builder (r-rel,r-dev) on 01/14
    - PASSed win-builder (r-rel,r-dev) on 01/15

- 0.12.14.8 (with new #ifdef)
    - sent to win-builder at 14:40 or so
    - on r-hub three WARNING (normal) and one ERROR (with UBSAN, no actual error, also normal)

@lionel-
Copy link
Contributor Author

lionel- commented Jan 16, 2018

we all concluded that the PR was not good at this time

Agreed. But we still need to figure out whether the PR really is the cause of the nondeterministic failures. Aren't you concerned that the failure also occurs with the previous CRAN release?

@eddelbuettel
Copy link
Member

eddelbuettel commented Jan 16, 2018

Aren't you concerned that the failure also occurs with the previous CRAN release?

Well, it doesn't. First entry:

- 0.12.14.1 just after 0.12.14
    - now in ~/git/rcpp-is-fucked-up/000_release_0.12.14/
    - sent to win-builder @ 10:32
    - 40-ish mins later PASS on r-devel and r-release

PASS on r-devel and r-release is where we started, with a snapshot just after 0.12.14.

I do not think that what you and I are doing right now is productive use of my time, or your vacation. We all had determined that the issue was with #789. You can fix it, or throw your arms in the air. But please do not make me re-visit every aspect now. We all missed that this comes up a lot on Windows, but also see @kevinushey above (or maybe in the #801 thread) stating that it also took his RStudio on OS X down though not each time.

Maybe we should just wait til this stabilizes in R-devel.

lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018
lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018
lionel- added a commit to lionel-/Rcpp that referenced this pull request May 30, 2018
lionel- added a commit to lionel-/Rcpp that referenced this pull request Jun 1, 2018
@lionel- lionel- mentioned this pull request Jun 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants