Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registration, autogenerated symbols and 'no documentation' false positive #723

Closed
eddelbuettel opened this issue Jun 30, 2017 · 26 comments
Closed

Comments

@eddelbuettel
Copy link
Member

What can we do when .registration=TRUE and we now generate globally-visible identifiers that have to be R and C++ compatible?

Under the default exporting regexp of exportPattern("^[[:alpha:]]+") these are visible, so R nags.

We cannot prefix with dot or underscore because (I think) in either case one language whines. But if we do nothing, R whines. Shall we (auot-)create an Rd (or roxygen) stub?

@jjallaire
Copy link
Member

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

I personally don't love the exportPattern("^[[:alpha:]]+") as there are typically more non-exported than exported functions in a package and this forces users into naming functions with dot prefixes. But I realize it's common convention so something we need to deal with.

We could always generate a single catch-all Rd file sort of like roxygen does for re-exports. e.g. https://github.com/rstudio/keras/blob/master/man/reexports.Rd

@eddelbuettel
Copy link
Member Author

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

It hit me during the class. If we use the package generators (either via Rcpp*.package.skeleton() or the equivalent generators in RStudio) then we get a wild-carded exportPattern(...). Personally I am fine editing it but for newbies it sucks.

Agreed on the re-exports file. I know how to write those in Rd, I failed to get it done with roxygenize() when I tried earlier today on the train. I guess the .Rd route is the way to go then.

@jjallaire
Copy link
Member

It seems like .registration = TRUE and exportPattern("^[[:alpha:]]+") together is an anti-pattern, as it will result in spamming the search path with a bunch of intermediate generated functions with awkward names (i.e. prefaced with pkgname_).

I wonder if we should move away from generating packages with exportPattern("^[[:alpha:]]+") entirely. I agree that it's trickier for beginners to get that they need to add exports to NAMESPACE, but the alternative that they never learn about NAMESPACE and have to use dot prefacing for all private functions seems just as bad.

@eddelbuettel
Copy link
Member Author

I hear you (and I wholeheartedly agree about the anti-pattern).

But the pattern is what R does by default. I am a little hesitant to move away from it.

I think ultimately the best may be to have two regexps: all matching 'x' yet excluding to cover our per-package pattern. Defensible idea?

But until then and for maybe some time auto-generate help stubs?

@jjallaire
Copy link
Member

jjallaire commented Jul 1, 2017 via email

@jjallaire
Copy link
Member

jjallaire commented Jul 2, 2017 via email

@eddelbuettel
Copy link
Member Author

Nice work. I like this a lot. And possibly a generic issue not just for Rcpp but everything with the same (default) regexp and .registration=TRUE ?

Do you have a moment to send a mail to r-devel? Else I can do it once I settled in a little here.

@eddelbuettel
Copy link
Member Author

Ok, I think I am loosing my marbles. Look for example at this src/init.c file in RcppAnnoy. We use global exports, we have no extra aliases, and we get no warnings from R CMD check --as-cran because the Rcpp Modules functions are exported as _rcpp_modules_*. I think I tried this 'by hand' with another package. Can we not prefix the generated symbols with _rcpp* as well?

@jjallaire
Copy link
Member

jjallaire commented Jul 2, 2017 via email

@jjallaire
Copy link
Member

jjallaire commented Jul 2, 2017 via email

@eddelbuettel
Copy link
Member Author

Sorry for the line noise. I had meant to test just this (ie using underscores) 'by hand' last week but I must have messed something up. I can confirm that this use in RcppAnnoy leads to zero complaints from R 3.4.0 and R-devel (as of a few days ago).

Getting this fix into the July release would be great. We should have time.

@jjallaire
Copy link
Member

jjallaire commented Jul 2, 2017 via email

@eddelbuettel
Copy link
Member Author

Right. That must be what I ran into -- R complains about _. The backtick-underscore solution is the best I can think of.

What other options do we have? Complain to r-devel?

@jjallaire
Copy link
Member

jjallaire commented Jul 2, 2017 via email

@eddelbuettel
Copy link
Member Author

I'd like that idea. I'll talk to Martyn about this in the next few days.

@eddelbuettel
Copy link
Member Author

The best idea seems to be change the exportPattern() to have two arguments where the second, optional one, would allow to specify an exclude pattern. Obviously, someone (us?) would have to write that first.

@jjallaire
Copy link
Member

We can actually already approximate this (albeit awkwardly) via the exportPattern regex (which doesn't use perl = TRUE so can't do lookahead). For example, if we wanted the prefix to be r_nr_ then we could use this export pattern:

exportPattern("^(([^r])|(r[^_])|(r_[^n])|(r_n[^r])|(r_nr[^_]))[[:alpha:]]+")

But of course that wouldn't work with existing instances of exportPattern, and the code we currently have does (because it uses only a _ prefix_. So I'd say if we need to generate new code in NAMESPACE we have what we need for that already (via the regex) and if we don't want to ask users to edit their NAMESPACE then we have a workable solution.

I still think that it can't possibly be intentional that native routine objects (which are not functions and only usable within a .Call) are export by default from R packages generated with package.skeleton. I think an automatic exclusion is warranted, but if that's not going to happen then I think we can be satisfied with either our current workaround or one based on a more elaborate regex in exportPattern.

@eddelbuettel
Copy link
Member Author

I was thinking about tuning the pattern in the regexp, but clearly not hard enough.

@mattfidler
Copy link

Hi, I was wondering about the use of R_forceSymbols. If you use this solution, then R cannot use the R_forceSymbols. Perhaps there should be a readme about this...?

@mattfidler
Copy link

To be more clear, you cannot use R_forceSymbols(dll,TRUE) beacuase R does not understand symbols that begin with _

@eddelbuettel
Copy link
Member Author

eddelbuettel commented Jul 20, 2017

Can you quote-protect / backtick-protect the underscore? Eg from one of my packages:

getDepends <- function(regexp = ".") {
    .Call('_RcppAPT_getDepends', PACKAGE = 'RcppAPT', regexp)
}

and with that I noticed I did not yet have .registration=TRUE there. Once done this becomes

getDepends <- function(regexp = ".") {
    .Call(`_RcppAPT_getDepends`, regexp)
}

@eddelbuettel
Copy link
Member Author

eddelbuettel commented Jul 20, 2017

Never mind, I guess, because R_forceSymbols() would be used in C / C++ code.

Do you have a reproducible example of something breaking / no longer working?

I just read up on in Writing R Extension. Are you suggesting we should add R_forceSymbols(dll, TRUE)? For kicks I just added it to the package used for the illustration here (RcppAPT) and it makes no difference. Installs and tests with and without it. I see no extra benefit, really,

@mattfidler
Copy link

I did not have a reproducable example, but saw that the new Rcpp exports were prefixed by underscore and assumed that R_forceSymbols would not work.

I'm not sure about the benefits of forcing symbols.

Writing R extensions suggests that forcing symbols is better (though I cannot see how either). There is a risk that the CRAN/R team will eventually force what is suggested as optional, like they did with package registration.

It is clear that .Call(_function_name) won't work since _function_name is not a valid R symbol. Upon testing, .Call('_function_name',...) with backquotes works just fine.

So this is an issue of my understanding, (and not seeing the backquote when looking at the updated code).

Does Rcpp now generate these registration symbols? Or you talking about the package skeleton that one of the R utilities does?

I think you should decide if R_forceSymbols(dll, TRUE) is worthwhile. I'm undecided, but since it is suggested, I personally include it.

@eddelbuettel
Copy link
Member Author

There is a fairly extensive discussion in "Writing R Extensions" which should answer all your questions. If you have additional questions, please consider the r-devel or r-package-devel lists as none of this is really specific to Rcpp.

We are simply helping to prepare user code for the now-recommended ways of interfacing R. And there is no issue here -- we went through the required changes (ie the backtick to protect the underscore) for the recent Rcpp 0.12.12 release.

@mattfidler
Copy link

That is fine. Thank you.

@eddelbuettel
Copy link
Member Author

This can be closed now too with Rcpp 0.12.12 on CRAN,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants