Switching from Rcpp_eval to Rcpp_fast_eval. close #866 #867

Merged
eddelbuettel merged 3 commits into RcppCore:master on Jun 17, 2018

Conversation

thirdwing
Member

@Enchufa2

This PR ports most of the Rcpp_eval calls to Rcpp_fast_eval.

I might need a little more time to double check.

@codecov-io

codecov-io commented Jun 10, 2018

Codecov Report

Merging #867 into master will increase coverage by <.01%.
The diff coverage is 88.88%.

@@            Coverage Diff             @@
##           master     #867      +/-   ##
==========================================
+ Coverage   90.21%   90.22%   +<.01%     
==========================================
  Files          70       70              
  Lines        3261     3263       +2     
==========================================
+ Hits         2942     2944       +2     
  Misses        319      319
Impacted Files                                      Coverage Δ
inst/include/Rcpp/r_cast.h                          100% <ø> (ø) ⬆️
inst/include/Rcpp/vector/Vector.h                   86.71% <ø> (ø) ⬆️
inst/include/Rcpp/Module.h                          94.11% <ø> (ø) ⬆️
inst/include/Rcpp/exceptions.h                      7.69% <ø> (ø) ⬆️
inst/include/Rcpp/Function.h                        80% <ø> (ø) ⬆️
inst/include/Rcpp/proxy/NamesProxy.h                75% <0%> (ø) ⬆️
src/barrier.cpp                                     64.77% <100%> (ø) ⬆️
inst/include/Rcpp/generated/Function__operator.h    100% <100%> (ø) ⬆️
inst/include/Rcpp/Environment.h                     100% <100%> (ø) ⬆️
inst/include/Rcpp/api/meat/Rcpp_eval.h              66.66% <0%> (+3.5%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d5cdb8...9f7760f. Read the comment docs.

@eddelbuettel
Member

Passing Travis is an excellent first step :)

@coatless
Contributor

Looks good, though the switchover is missing in src/barrier.cpp:

Rcpp/src/barrier.cpp

Lines 30 to 32 in 5d5cdb8

namespace Rcpp { SEXP Rcpp_eval(SEXP, SEXP); }

Rcpp/src/barrier.cpp

Lines 99 to 101 in 5d5cdb8

Rcpp::Shield<SEXP> call(Rf_lang2(getNamespaceSym, RcppString));
Rcpp::Shield<SEXP> RCPP(Rcpp_eval(call, R_GlobalEnv));

Rcpp/src/barrier.cpp

Lines 141 to 143 in 5d5cdb8

Rcpp::Shield<SEXP> call(Rf_lang2(getNamespaceSym, RcppString));
Rcpp::Shield<SEXP> RCPP(Rcpp_eval(call, R_GlobalEnv));
Rcpp::Shield<SEXP> cache(Rf_allocVector(VECSXP, RCPP_CACHE_SIZE));

@Enchufa2
Member

@thirdwing Why did you switch back to Rcpp_eval in src/barrier.cpp in the second commit?

@thirdwing
Member Author

@Enchufa2 I got some linking errors on one machine after changing barrier.cpp, which is why I said I needed a little more time.

@eddelbuettel
Member

No issues in rev.dep, results committed in rcpp-logs repo under RcppCore/rcpp-logs@8d355ce

Member

@eddelbuettel eddelbuettel left a comment

The "textual" substitution is very clean and the rev. deps are fine, so I am happy to merge this. We can look into the remaining file and other issues in a follow-up PR as needed.

@kevinushey
Contributor

This PR looks good to me as well. @lionel- any reservations around using Rcpp_fast_eval() by default internally in Rcpp? (Do you think we should be using it by default with R 3.5.x as well?)

@lionel-
Contributor

lionel- commented Jun 11, 2018

If RCPP_PROTECTED_EVAL is unset (or if it is set but R version is < 3.5) Rcpp_fast_eval() devolves to Rcpp_eval() so this should be safe to merge.
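
The fallback described above can be pictured as a compile-time switch. The following is a minimal standalone sketch, not Rcpp's actual source: `DEMO_R_VERSION`, `DEMO_PROTECTED_EVAL`, `DEMO_USING_UNWIND_PROTECT`, `demo_eval()` and `demo_fast_eval()` are hypothetical stand-ins for the R version macro, `RCPP_PROTECTED_EVAL`, `Rcpp_eval()` and `Rcpp_fast_eval()`, and the version encoding is an assumption made for illustration only.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the R version being compiled against,
// encoded here as major*10000 + minor*100 + patch (an assumption).
#define DEMO_R_VERSION 30500

// Uncomment to opt in, mirroring '#define RCPP_PROTECTED_EVAL' in the thread.
// #define DEMO_PROTECTED_EVAL

// Protect-unwind is used only if the user opted in AND R is at least 3.5.0.
#if defined(DEMO_PROTECTED_EVAL) && DEMO_R_VERSION >= 30500
#define DEMO_USING_UNWIND_PROTECT
#endif

// Stand-in for the tryCatch()-based evaluator.
inline std::string demo_eval() { return "tryCatch path"; }

// Stand-in for the fast evaluator: it falls back to demo_eval()
// unless protect-unwind was enabled at compile time.
inline std::string demo_fast_eval() {
#ifdef DEMO_USING_UNWIND_PROTECT
    return "protect-unwind path";
#else
    return demo_eval();
#endif
}
```

With the opt-in macro left commented out, `demo_fast_eval()` returns the same value as `demo_eval()`, which is the sense in which the switch is a no-op by default.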

@thirdwing Have you tested this with protect-unwind enabled? You can enable it by adding #define RCPP_PROTECTED_EVAL on top of Rcpp.h.

> Do you think we should be using it by default with R 3.5.x as well?

I was thinking we should first get an Rcpp release with protect-unwind disabled by default, then enable it in our packages to get some user exposure, and only then turn it on by default in Rcpp.

@eddelbuettel
Member

I will once again run a second rev.dep with the #define on, probably overnight.

Agreed that this should be safe to merge as is.

@lionel-
Contributor

lionel- commented Jun 11, 2018

> I will once again run a second rev.dep with the #define on, probably overnight.

Will this be on R 3.5? If R 3.4 the define will not have an effect.

> Agreed that this should be safe to merge as is.

On second thought, this might cause trouble. Switching to fast eval means errors will be rethrown as is instead of being wrapped in an Rcpp::eval_error. Also, the error message will no longer start with "Evaluation error: ". Since users tend to check for errors based on the error message (and more rarely on the error class), this might cause revdep issues.

I think it's the right move to eventually switch to fast eval, but it might not be as smooth a transition as we could hope for. So it might be more prudent to first enable protect-unwind and only then switch to fast eval throughout Rcpp, so we can more easily assess the impact of enabling protect-unwind on revdep checks.
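
To illustrate why code that inspects the message text could break, here is a minimal standalone sketch. It does not use Rcpp: `demo_eval_error`, `wrapped_message()`, `raw_message()` and `has_eval_prefix()` are hypothetical names, and the only detail taken from the discussion is the "Evaluation error: " prefix.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical model of a wrapper exception whose message carries the prefix.
struct demo_eval_error : std::runtime_error {
    explicit demo_eval_error(const std::string& msg)
        : std::runtime_error("Evaluation error: " + msg) {}
};

// Message as seen when the R error is wrapped (the tryCatch-based path).
inline std::string wrapped_message(const std::string& r_msg) {
    return demo_eval_error(r_msg).what();
}

// Message as seen when the R error is rethrown as is (the fast path).
inline std::string raw_message(const std::string& r_msg) {
    return r_msg;
}

// A fragile check of the kind downstream tests might perform:
// it only recognises failures whose message starts with the prefix.
inline bool has_eval_prefix(const std::string& msg) {
    const std::string prefix = "Evaluation error: ";
    return msg.compare(0, prefix.size(), prefix) == 0;
}
```

A check like `has_eval_prefix()` accepts the wrapped message but rejects the raw one, even though both describe the same underlying R error.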

@eddelbuettel
Member

> If R 3.4 the define will not have an effect.

Crap. Time to bite the bullet I suppose and rebuild the rev.dep checker.

@Enchufa2
Member

Enchufa2 commented Jun 11, 2018

With R 3.5.0 and RCPP_PROTECTED_EVAL defined, I see

undefined symbol Rcpp::Rcpp_eval(SEXPREC*, SEXPREC*)

when installing Rcpp. If I substitute Rcpp_eval calls in src/barrier.cpp with Rcpp_fast_eval calls, the issue goes away.

Edit: I managed to borrow an idle machine, although not very powerful. Revdep checking now with the changes above. It might take a while.

@eddelbuettel
Member

R 3.5.0 without the #define passed just fine.

I just started R 3.5.0 with the #define (and the fixed barrier.cpp).

@thirdwing
Member Author

I think the linking problem was introduced by me in a previous PR.

The simplest workaround is to use Rf_eval in barrier.cpp.

@eddelbuettel
Member

Using Rf_eval sounds like a good plan there.

In barrier.cpp, what we evaluate is 'getNamespace("Rcpp")'.
Of course, it would be better to call this in a try-catch manner,
but doing so can cause linking problems.

@thirdwing
Member Author

@Enchufa2 Can you try the code again with RCPP_PROTECTED_EVAL defined?

@Enchufa2
Member

Enchufa2 commented Jun 13, 2018 via email

@eddelbuettel
Member

All "mostly" good here too with

  • R 3.5.0
  • #define for RCPP_PROTECTED_EVAL
  • barrier.cpp edited to use Rcpp_fast_eval (didn't think of Rf_eval then)

Everything passes, but I got errors on reticulate, tidyxl and V8. Those matter, so I will check again later with Rf_eval.

We could still merge as RCPP_PROTECTED_EVAL is not generally on.

@eddelbuettel
Member

eddelbuettel commented Jun 13, 2018

Regression alert: packages

  • reticulate
  • tidyxl
  • V8

all fail their tests with this version of the PR (i.e. Rf_eval in src/barrier.cpp) as well as with the one from one commit back (using Rcpp_fast_eval). Any chance you could examine this, @lionel- ?

@lionel-
Contributor

lionel- commented Jun 13, 2018

Happy to, could you link to or send me the logs please?

Did you run the checks on master or on this branch?

@eddelbuettel
Member

This branch, once without the define, once with. That second run had an imperfect barrier.cpp but I ran the three failures again with the branch as it is now.

It is not my machine, so access is not easy. I will tar them up and put them on my webserver. Good enough? I'll DM you the URL.

@Enchufa2
Member

> Regression alert: packages

Same here.

@eddelbuettel
Member

I am leaning towards merging as is, despite the three known regressions as it is still opt-in with the #define.

Any thoughts?

@thirdwing
Member Author

I am looking into the checking error in reticulate. Please give me one more day.

@@ -97,7 +95,7 @@ SEXP get_rcpp_cache() {
 SEXP getNamespaceSym = Rf_install("getNamespace"); // cannot be gc()'ed once in symbol table
 Rcpp::Shield<SEXP> RcppString(Rf_mkString("Rcpp"));
 Rcpp::Shield<SEXP> call(Rf_lang2(getNamespaceSym, RcppString));
-Rcpp::Shield<SEXP> RCPP(Rcpp_eval(call, R_GlobalEnv));
+Rcpp::Shield<SEXP> RCPP(Rf_eval(call, R_GlobalEnv));
Contributor

This should be evaluated in base or in the Rcpp namespace.

Contributor

Most (all?) of these Rcpp_eval() / Rcpp_fast_eval() calls, actually :)

Probably best to make these changes in another PR. And merge / check that PR before this one.

Member

Or modify this one and test it again?

Contributor

I was thinking a different PR would make things easier should this one be reversed.

Member

True. On the other hand this one "does no harm" / has no effect until the #define. So we could merge, merge the two smaller ones behind it and have a new baseline to work from. We surely have different paths forward...

Member Author

I think you can send another PR and I will reverse this commit.

Member

What do you mean? This PR is three commits. What do you suggest reversing?

Member Author

I mean the changes in barrier.cpp.

Member

Could do. And test that again, or first against those three? And wasn't there a link issue?

@lionel-
Contributor

lionel- commented Jun 14, 2018

I can reproduce the failures. tidyxl is ok (it's failing because it's checking the error string). Working on figuring out the other two between meetings (it's the RStudio work week).

I don't think we should merge this PR right now because master doesn't have the failures, so there may be something wrong here.

@lionel-
Contributor

lionel- commented Jun 14, 2018

The V8 issue is because Rcpp code is run in callbacks: https://github.com/jeroen/V8/blob/c67abd98797e488adc941ce67097b06e6547f6dc/src/V8.cpp#L63

@jeroen I think it would be safer not to use Rcpp in such callbacks and use protect-unwind manually.

Going to investigate the reticulate error soon.

@thirdwing
Member Author

@lionel- The error in reticulate should come from the changes in inst/include/Rcpp/generated/Function__operator.h

@jjallaire
Member

jjallaire commented Jun 14, 2018 via email

@thirdwing
Member Author

@lionel- I think the error in reticulate is because Rcpp_fast_eval never throws.

@lionel-
Contributor

lionel- commented Jun 15, 2018

@thirdwing Rcpp_fast_eval() does throw, this is how it gets to unwind the C++ stack.

When sourcing a Python script, the getter defined here https://github.com/rstudio/reticulate/blob/43664503884625f55cd4d452228a7a855ce86ce2/R/python.R#L1323 is called from the try-catch block linked by @jjallaire. The getter parses strings as R code but it is passed two non-syntactic values __call__ and __iter__:

// rArgs:
[[1]]
<__main__.R>

[[2]]
__call__

Error in parse(text = as_r_value(code)) : <text>:1:1: unexpected input
1: _
    ^

// rArgs:
[[1]]
<__main__.R>

[[2]]
__iter__

Error in parse(text = as_r_value(code)) : <text>:1:1: unexpected input
1: _
    ^

So Function doCall("do.call") throws an R error and then a C++ error. With Rcpp_eval() the R error is caught by the R tryCatch() and an exception inheriting from std::exception is caught by the C++ try/catch. With Rcpp_fast_eval() the error is not caught on the R side but protect-unwind kicks in and throws a C++ LongjumpException. Since the latter does not inherit from std::exception, it hits the catch-all block. However this difference doesn't matter here because it only impacts the Python error string of errorDict.
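
The handler dispatch described above can be reproduced without R. This is a minimal standalone sketch, not reticulate's or Rcpp's code: `DemoLongjumpException`, `Thrown` and `which_handler()` are hypothetical names, and the only property modelled is that the longjump exception does not inherit from std::exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the longjump exception. It deliberately
// does NOT derive from std::exception, as described in the thread.
struct DemoLongjumpException {};

// Which kind of object the simulated evaluation throws.
enum class Thrown { StdException, Longjump };

// Reports which handler ends up catching the thrown object.
inline std::string which_handler(Thrown kind) {
    try {
        if (kind == Thrown::StdException) {
            // Models the tryCatch-based path: the R error arrives wrapped
            // in an exception that inherits from std::exception.
            throw std::runtime_error("wrapped R error");
        }
        // Models the protect-unwind path.
        throw DemoLongjumpException();
    } catch (const std::exception&) {
        return "std::exception handler";
    } catch (...) {
        return "catch-all handler";
    }
}
```

Only the wrapped error reaches the `std::exception` handler; the longjump stand-in falls through to the catch-all block.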

The reason we observe a different behaviour is that if you don't catch the error on the R side, it gets printed before initiating the error longjump. Hence catching a longjump from the C++ side without catching it on the R side will always produce the side effect of printing the error message as part of the unwinding.

I think in general these longjumps should not be interrupted. The good practice would be to catch the LongjumpException (which is currently namespaced in Rcpp::internal, we might want to make it public) and rethrow it immediately. For the reticulate error it might be harder to rethrow since the libpython stack is in the way. Maybe we should add a getter for the SEXP token which would be easier to pass around, and make public a constructor for a LongjumpException that'd take the token as argument. This way you can easily pass the token in the C contexts and resume the longjump once back in Rcpp context (or maybe use the R API to resume the longjump if you're no longer in Rcpp context and it is safe to do so).
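
The "catch and rethrow immediately" practice suggested above could look roughly like the following standalone sketch. It does not use Rcpp: `DemoLongjumpException`, its `token` member (an `int` standing in for the SEXP token), `failing_eval()`, `middle_layer()` and `outermost()` are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the longjump exception; the int models
// the SEXP token that identifies the pending unwind.
struct DemoLongjumpException {
    int token;
};

// Models an evaluation that fails and starts unwinding.
inline void failing_eval() {
    throw DemoLongjumpException{42};
}

// An intermediate layer that has a catch-all for its own purposes.
// It catches the longjump first, only to rethrow it untouched.
inline void middle_layer(std::vector<std::string>& log) {
    try {
        failing_eval();
    } catch (const DemoLongjumpException&) {
        log.push_back("middle: rethrowing");
        throw;  // do not interrupt the unwind
    } catch (...) {
        log.push_back("middle: swallowed");
    }
}

// The outermost frame, where the jump would be resumed using the token.
inline int outermost(std::vector<std::string>& log) {
    try {
        middle_layer(log);
    } catch (const DemoLongjumpException& e) {
        log.push_back("outer: resuming");
        return e.token;
    }
    return -1;  // no unwind happened
}
```

Because the dedicated handler is listed before the catch-all, the unwind passes through `middle_layer()` intact and the token is still available in `outermost()`.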

For the cases where you really want to interrupt the longjump, this gets trickier if Rcpp switches to fast_eval everywhere. But we could provide some kind of Rcpp_catch_unwind() function that takes a C/C++ function pointer that would be called from an R tryCatch() context. This way we pay the tryCatch() overhead only once and still get to properly catch the R errors.
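
As a rough picture of that proposal, here is a standalone toy model. Nothing in it is an existing Rcpp API: `demo_catch_unwind()`, `demo_callback`, `DemoLongjump` and `three_evals()` are hypothetical names, and the `handler_setups` counter merely models the one-off tryCatch() cost.

```cpp
#include <cassert>

// Hypothetical stand-in for the exception used to unwind the C++ stack.
struct DemoLongjump {};

// C-style callback signature: a function pointer plus opaque user data.
typedef void (*demo_callback)(void* data);

// Hypothetical analogue of the proposed function: one handler is set up
// (counted in handler_setups), then the whole callback runs under it.
// Returns true if the callback finished, false if an unwind was caught.
inline bool demo_catch_unwind(demo_callback fn, void* data, int& handler_setups) {
    ++handler_setups;  // models paying the tryCatch() overhead once
    try {
        fn(data);
    } catch (const DemoLongjump&) {
        return false;  // the unwind is deliberately interrupted here
    }
    return true;
}

// Example callback: three "fast evaluations", the third of which fails.
inline void three_evals(void* data) {
    int* completed = static_cast<int*>(data);
    for (int i = 0; i < 3; ++i) {
        if (i == 2) throw DemoLongjump();
        ++*completed;
    }
}
```

In this model the callback performs several evaluations under a single handler, so the setup cost is paid once rather than once per evaluation.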

@lionel-
Contributor

lionel- commented Jun 15, 2018

The good news is that it appears fast_eval is doing the right thing in all cases :)

@lionel-
Contributor

lionel- commented Jun 15, 2018

Oh, and the last piece of the puzzle is that unhandled errors are recorded by testthat in a calling handler. So even though the code path eventually succeeds, the error is detected and reported to R CMD check. This is another reason not to handle errors on the C++ side without handling them on the R side as well. Either resume the jump or evaluate Rcpp code in an Rcpp_catch_unwind() context, as argued above. This way the R and C++ semantics are aligned.

@eddelbuettel
Member

So what is the consensus then for next steps? Merge this, have a follow-up PR?

@lionel-
Contributor

lionel- commented Jun 17, 2018

I think this can be merged and I'll send a follow-up PR with an API to deal with handling of R errors at the C++ level. I'll also look into the eval envs.

@eddelbuettel eddelbuettel merged commit 300d000 into RcppCore:master Jun 17, 2018

8 participants