Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register native routines automatically within compileAttributes #694

Merged
merged 5 commits into from
May 18, 2017

Conversation

jjallaire
Copy link
Member

There are two main components here:

  1. Use the metadata that we already have within compileAttributes to automatically emit a package init function that registers all exported C++ functions with R via R_registerRoutines.

    Note that users have been doing this manually for several months and some packages may already have an init function for other reasons. Consequently, we only do automatic registration if no existing package init function is found.

  2. Registering native functions and calling useDynLib(foo, .registration = TRUE) is actually not sufficient to have R wrappers bind to the registered routines. To do that we actually need to generate this code:

    .Call(reticulate_py_is_none, x)

    Rather than this code:

    .Call('reticulate_py_is_none', PACKAGE = 'reticulate', x)

    Note that we only want to generate the former code when the user has included .registration = TRUE (otherwise the routines won't be present in the package namespace). As a result, we scan the NAMESPACE file for useDynLib(pkgname, .registration = TRUE) and only generate R wrappers of that form if it's found.

Relevant documentation from Writing R Extensions is here: https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Registering-native-routines

Of particular interest is the bit about R_useDynamicSymbols(info, FALSE). My reading of this is that by calling this you are saying that it still okay to to use this syntax:

.Call('reticulate_py_is_none', PACKAGE = 'reticulate', x)

However, only if the symbol has been registered. This is important for us because we are now going to always pass FALSE to R_useDynamicSymbols, and we don't want that to break packages that in turn neglect to pass .registration = TRUE to useDynLib. Since we register all the Rcpp exported functions they will work using either the character based or symbol based .Call lookup.

This branch of the reticulate package demonstrates the change in code generation that occurs:

https://github.com/rstudio/reticulate/compare/feature/register-native-routines

Note that we've changed our @useDynLib roxygen to:

#' @useDynLib reticulate, .registration=TRUE

And that we've removed entirely the init.cpp file previously provided as a stub to pass CRAN tests.

@codecov-io
Copy link

codecov-io commented May 16, 2017

Codecov Report

Merging #694 into master will decrease coverage by 0.2%.
The diff coverage is 83.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #694      +/-   ##
==========================================
- Coverage   89.72%   89.52%   -0.21%     
==========================================
  Files          66       66              
  Lines        3523     3588      +65     
==========================================
+ Hits         3161     3212      +51     
- Misses        362      376      +14
Impacted Files Coverage Δ
R/Attributes.R 97.61% <77.14%> (-1.69%) ⬇️
src/attributes.cpp 98.4% <86.66%> (-0.51%) ⬇️
R/Rcpp.package.skeleton.R 83.92% <88.88%> (+0.19%) ⬆️
R/loadModule.R 98.71% <0%> (-1.29%) ⬇️
inst/include/Rcpp/vector/proxy.h 94.73% <0%> (+0.61%) ⬆️
R/RcppClass.R 78.16% <0%> (+1.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e0f79c...948551c. Read the comment docs.

@kevinushey
Copy link
Contributor

LGTM!

If I understand correctly, this means users can implicitly opt-out of Rcpp native routine registration just by making sure they provide their own call to R_init_<pkg> somewhere; is that right?

@jjallaire
Copy link
Member Author

jjallaire commented May 16, 2017 via email

@kevinushey
Copy link
Contributor

That sounds like the right behavior to me as well. FWIW from what I understand, the R_useDynamicSymbols(dllInfo, FALSE) bit implies the following:

.Call("routine", PACKAGE = "package") # okay
.Call(routine)                        # okay given R native routine object called 'routine' 
.Call("routine")                      # fail: won't discover routine

I think in theory the 'old' way we constructed Rcpp exports would still work fine -- it's only when users fail to provide the PACKAGE = argument will lookup fail. (Of course it's better for us to use the routine wrapper objects directly)

@jjallaire
Copy link
Member Author

jjallaire commented May 16, 2017 via email

@eddelbuettel
Copy link
Member

eddelbuettel commented May 16, 2017

I am not entirely sure I agree with all of this. For months now, I have been doing two things:

  • add native registration by using R-devel or R 3.4.0
  • add .registration=TRUE in NAMESPACE

but nothing else. In particular, I did not bother to alter all the "quoted" .Call() uses. I also never overwrote the default I on the boolean in R_useDynamicSymbols(). Grepping for this now it seems like all my C++ packages use FALSE and the few plain C packages (digest, RApiDatetime, RApiSerialize) use TRUE.

I have the suspicion I am still missing something here.

@jjallaire
Copy link
Member Author

Yes, that's all correct. You were setting the table properly to use native symbols however compileAttributes wasn't generating code that actually used the symbols. e.g. in my reticulate example as a result of registering the symbols the following was sitting in the reticulate namespace:

reticulate:::reticulate_py_is_none

However, when calling "py_is_none" the string based version of lookup rather than the symbol based lookup was being used.

Net: all of that registration was really just meeting the requirements of CRAN checks but not actually put into use by the package.

The FALSE setting is recommended in Writing R Extensions as a good sanitary practice. There is a further call R_forceSymbols(dll, TRUE) you can make (also documented there) which would actually require the use of the symbols and prevent any character based lookup. We don't want to generate that code though because it would break people who haven't bother to do useDynLib(pkgname, .registration = TRUE)

@eddelbuettel
Copy link
Member

Net: all of that registration was really just meeting the requirements of CRAN checks but not actually put into use by the package.

I think it is not quite as dire. On one package (anytime, use Rcpp, has FALSE) I hit a "unpleasant many-legged creature" when altering a function signature, running (older) compileAtttributes() and still couldn't -- as init.c needed an update for the change number of arguments. So some checking was done by R.

But I think I agree: nicer calls, with the symbol rather than string, are the real goal. Getting there.

This is a huge help. Running an RcppArmadillo test now; wonder how to best test this. Probably need to rebuild a few packages with / without the existing init.c.

@jjallaire
Copy link
Member Author

jjallaire commented May 16, 2017 via email

@eddelbuettel
Copy link
Member

The performance aspect is nice but the delta seems a little marginal from my reading of WRE.

Now, I will admit that I got totally confused a few weeks ago over one aspect I couldn't quite remember the details: we added Rcpp::interface() a while back, but did we also add the 'can it be called' testing layer? I had forgotten about that.

And should we not now push harder for Rcpp::interface() ?

@eddelbuettel
Copy link
Member

And/or do you know a package using Rcpp::interface() to provide C(++) code for another package?

@jjallaire
Copy link
Member Author

jjallaire commented May 16, 2017 via email

@eddelbuettel
Copy link
Member

Yeah. I have it in RQuantLib, and "more code got generated" (that I sort-of had forgotten about) but I also do not know of a user package ... (And we now have header-only subset of QuantLib in Quantuccia so it is even more moot.)

@eddelbuettel
Copy link
Member

First real test: "boom". In anytime, I have src/init.c but Rcpp created registration in src/RcppExports.cpp too:

g++ -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o anytime.so RcppExports.o anytime.o init.o -L/usr/lib/R/lib -lR
init.o: In function `R_init_anytime':
/home/edd/git/anytime/src/init.c:36: multiple definition of `R_init_anytime'
RcppExports.o:/home/edd/git/anytime/src/RcppExports.cpp:136: first defined here
collect2: error: ld returned 1 exit status
/usr/share/R/share/make/shlib.mk:6: recipe for target 'anytime.so' failed
make: *** [anytime.so] Error 1
ERROR: compilation failed for package ‘anytime’
* removing ‘/usr/local/lib/R/site-library/anytime’
* restoring previous ‘/usr/local/lib/R/site-library/anytime’
edd@max:~/git/anytime(feature/numeric_input_change)$ 

Quick guess: Maybe because patch currently does not scan C files?

Necessary to discover R_init_pkgname functions defined within .c files (e.g. init.c)
@jjallaire
Copy link
Member Author

jjallaire commented May 17, 2017 via email

@eddelbuettel
Copy link
Member

eddelbuettel commented May 17, 2017

@jjallaire Actual bug. Using my small-ish / recent-ish RcppQuantuccia package, I removed src/init.c and called compileAttribute(). The RcppExports where updated, but:

~/git/rcppquantuccia(master)$ R CMD INSTALL .
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘RcppQuantuccia’ ...
** libs
g++ -std=gnu++11 -I/usr/share/R/include -DNDEBUG -I../inst/include/ -I. -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"    -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++11 -I/usr/share/R/include -DNDEBUG -I../inst/include/ -I. -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"    -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c dates.cpp -o dates.o
g++ -std=gnu++11 -I/usr/share/R/include -DNDEBUG -I../inst/include/ -I. -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/BH/include"    -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c utils.cpp -o utils.o
g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o RcppQuantuccia.so RcppExports.o dates.o utils.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/RcppQuantuccia/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for ‘RcppQuantuccia’ in .doLoadActions(where, attach):
 error in load action .__A__.1 for package RcppQuantuccia: .Call("RcppQuantuccia_RcppExport_registerCCallable", PACKAGE = "RcppQuantuccia"): "RcppQuantuccia_RcppExport_registerCCallable" not available for .Call() for package "RcppQuantuccia"
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/usr/local/lib/R/site-library/RcppQuantuccia’
* restoring previous ‘/usr/local/lib/R/site-library/RcppQuantuccia’
~/git/rcppquantuccia(master)$ 

Edit 1: The R/RcppExports.R has this at the end:

# Register entry points for exported C++ functions
methods::setLoadAction(function(ns) {
    .Call('RcppQuantuccia_RcppExport_registerCCallable', PACKAGE = 'RcppQuantuccia')
})

If I comment that out, all is good.

Edit 2: And it is caused by the Rcpp::interfaces(r, cpp) I had carried over. If I undo that, we're good.

@jjallaire
Copy link
Member Author

jjallaire commented May 17, 2017 via email

@jjallaire
Copy link
Member Author

Just pushed a fix (and tested it with RcppQuantuccia).

There's one more lingering issue: if users of attributes are combining that with exporting C/C++ functions without attributes then our current code generation won't pick that up. I don't know if that scenario exists as a practical matter but it seems like somewhere in the wild it must. We have a couple different options here:

  1. Under R 3.4 actually call the tools exported function to do the code generation (but still include it within RcppExports.cpp). That would pickup all of the exported functions not just the ones exported using attributes.

  2. Say that this is a special case and if users have standalone old-style exports they should write their own package init function (however we'd probably only get the chance to communicate this after they regenerate and notice their package no longer loads!)

  3. As an extension to XPtr<void> #2 to prevent any build errors from ever occurring, actually detect additional old-style exports and don't write our own registration in that case (similar to how we don't write registrations when we see a package init function).

I think option (1) vs. option (2+3) are about the same amount of work. I probably favor (1).

@eddelbuettel
Copy link
Member

Thanks for the fix, will test.

I like 1) and was surprised you didn't go calling it directly. Shall we enable that more generally, or 'just' in the case of exported functions?

…e to discover and register additional non-Rcpp exported native routines
@jjallaire
Copy link
Member Author

I implemented (1) so this case should now be covered. The reason I didn't have R do all of the code generation is that it generates different code (assumes it's in a .c rather than .cpp file) and we may need to eventually inject additional code into the init function (e.g. imagine a package author wants our generated registration but also wants to run some code on init, we could scan for the definition of Rcpp_init_pkgname and then call it from R_init_pkgname if it exists). Calling R from this spot within code generation also won't find the definitions in RcppExports.R since they haven't yet been written (i.e. we'd need to always inject them into the generated R code, so easier to just generate our own code in the first place).

@eddelbuettel
Copy link
Member

Ahh. I see now.

This should be good to merge then.

@krlmlr
Copy link
Contributor

krlmlr commented May 31, 2017

Re Rcpp::interfaces: I'm actually using it for plogr (logging) and bindrcpp (creating/populating environments with bindings that call a C++ functiion).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants