New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registration, autogenerated symbols and 'no documentation' false positive #723

Closed
eddelbuettel opened this Issue Jun 30, 2017 · 26 comments

Comments

Projects
None yet
3 participants
@eddelbuettel
Member

eddelbuettel commented Jun 30, 2017

What can we do when .registration=TRUE and we now generate globally-visible identifiers that have to be R and C++ compatible?

Under the default exporting regexp of exportPattern("^[[:alpha:]]+") these are visible, so R nags.

We cannot prefix with dot or underscore because (I think) in either case one language whines. But if we do nothing, R whines. Shall we (auot-)create an Rd (or roxygen) stub?

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jun 30, 2017

Member

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

I personally don't love the exportPattern("^[[:alpha:]]+") as there are typically more non-exported than exported functions in a package and this forces users into naming functions with dot prefixes. But I realize it's common convention so something we need to deal with.

We could always generate a single catch-all Rd file sort of like roxygen does for re-exports. e.g. https://github.com/rstudio/keras/blob/master/man/reexports.Rd

Member

jjallaire commented Jun 30, 2017

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

I personally don't love the exportPattern("^[[:alpha:]]+") as there are typically more non-exported than exported functions in a package and this forces users into naming functions with dot prefixes. But I realize it's common convention so something we need to deal with.

We could always generate a single catch-all Rd file sort of like roxygen does for re-exports. e.g. https://github.com/rstudio/keras/blob/master/man/reexports.Rd

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jun 30, 2017

Member

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

It hit me during the class. If we use the package generators (either via Rcpp*.package.skeleton() or the equivalent generators in RStudio) then we get a wild-carded exportPattern(...). Personally I am fine editing it but for newbies it sucks.

Agreed on the re-exports file. I know how to write those in Rd, I failed to get it done with roxygenize() when I tried earlier today on the train. I guess the .Rd route is the way to go then.

Member

eddelbuettel commented Jun 30, 2017

I don't fully get the scenario so hard to say. What's a concrete example of a function that triggers this error?

It hit me during the class. If we use the package generators (either via Rcpp*.package.skeleton() or the equivalent generators in RStudio) then we get a wild-carded exportPattern(...). Personally I am fine editing it but for newbies it sucks.

Agreed on the re-exports file. I know how to write those in Rd, I failed to get it done with roxygenize() when I tried earlier today on the train. I guess the .Rd route is the way to go then.

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 1, 2017

Member

It seems like .registration = TRUE and exportPattern("^[[:alpha:]]+") together is an anti-pattern, as it will result in spamming the search path with a bunch of intermediate generated functions with awkward names (i.e. prefaced with pkgname_).

I wonder if we should move away from generating packages with exportPattern("^[[:alpha:]]+") entirely. I agree that it's trickier for beginners to get that they need to add exports to NAMESPACE, but the alternative that they never learn about NAMESPACE and have to use dot prefacing for all private functions seems just as bad.

Member

jjallaire commented Jul 1, 2017

It seems like .registration = TRUE and exportPattern("^[[:alpha:]]+") together is an anti-pattern, as it will result in spamming the search path with a bunch of intermediate generated functions with awkward names (i.e. prefaced with pkgname_).

I wonder if we should move away from generating packages with exportPattern("^[[:alpha:]]+") entirely. I agree that it's trickier for beginners to get that they need to add exports to NAMESPACE, but the alternative that they never learn about NAMESPACE and have to use dot prefacing for all private functions seems just as bad.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 1, 2017

Member

I hear you (and I wholeheartedly agree about the anti-pattern).

But the pattern is what R does by default. I am a little hesitant to move away from it.

I think ultimately the best may be to have two regexps: all matching 'x' yet excluding to cover our per-package pattern. Defensible idea?

But until then and for maybe some time auto-generate help stubs?

Member

eddelbuettel commented Jul 1, 2017

I hear you (and I wholeheartedly agree about the anti-pattern).

But the pattern is what R does by default. I am a little hesitant to move away from it.

I think ultimately the best may be to have two regexps: all matching 'x' yet excluding to cover our per-package pattern. Defensible idea?

But until then and for maybe some time auto-generate help stubs?

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 1, 2017

Member
Member

jjallaire commented Jul 1, 2017

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 2, 2017

Member
Member

jjallaire commented Jul 2, 2017

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 2, 2017

Member

Nice work. I like this a lot. And possibly a generic issue not just for Rcpp but everything with the same (default) regexp and .registration=TRUE ?

Do you have a moment to send a mail to r-devel? Else I can do it once I settled in a little here.

Member

eddelbuettel commented Jul 2, 2017

Nice work. I like this a lot. And possibly a generic issue not just for Rcpp but everything with the same (default) regexp and .registration=TRUE ?

Do you have a moment to send a mail to r-devel? Else I can do it once I settled in a little here.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 2, 2017

Member

Ok, I think I am loosing my marbles. Look for example at this src/init.c file in RcppAnnoy. We use global exports, we have no extra aliases, and we get no warnings from R CMD check --as-cran because the Rcpp Modules functions are exported as _rcpp_modules_*. I think I tried this 'by hand' with another package. Can we not prefix the generated symbols with _rcpp* as well?

Member

eddelbuettel commented Jul 2, 2017

Ok, I think I am loosing my marbles. Look for example at this src/init.c file in RcppAnnoy. We use global exports, we have no extra aliases, and we get no warnings from R CMD check --as-cran because the Rcpp Modules functions are exported as _rcpp_modules_*. I think I tried this 'by hand' with another package. Can we not prefix the generated symbols with _rcpp* as well?

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 2, 2017

Member
Member

jjallaire commented Jul 2, 2017

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 2, 2017

Member
Member

jjallaire commented Jul 2, 2017

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 2, 2017

Member

Sorry for the line noise. I had meant to test just this (ie using underscores) 'by hand' last week but I must have messed something up. I can confirm that this use in RcppAnnoy leads to zero complaints from R 3.4.0 and R-devel (as of a few days ago).

Getting this fix into the July release would be great. We should have time.

Member

eddelbuettel commented Jul 2, 2017

Sorry for the line noise. I had meant to test just this (ie using underscores) 'by hand' last week but I must have messed something up. I can confirm that this use in RcppAnnoy leads to zero complaints from R 3.4.0 and R-devel (as of a few days ago).

Getting this fix into the July release would be great. We should have time.

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 2, 2017

Member
Member

jjallaire commented Jul 2, 2017

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 2, 2017

Member

Right. That must be what I ran into -- R complains about _. The backtick-underscore solution is the best I can think of.

What other options do we have? Complain to r-devel?

Member

eddelbuettel commented Jul 2, 2017

Right. That must be what I ran into -- R complains about _. The backtick-underscore solution is the best I can think of.

What other options do we have? Complain to r-devel?

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 2, 2017

Member
Member

jjallaire commented Jul 2, 2017

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 2, 2017

Member

I'd like that idea. I'll talk to Martyn about this in the next few days.

Member

eddelbuettel commented Jul 2, 2017

I'd like that idea. I'll talk to Martyn about this in the next few days.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 9, 2017

Member

The best idea seems to be change the exportPattern() to have two arguments where the second, optional one, would allow to specify an exclude pattern. Obviously, someone (us?) would have to write that first.

Member

eddelbuettel commented Jul 9, 2017

The best idea seems to be change the exportPattern() to have two arguments where the second, optional one, would allow to specify an exclude pattern. Obviously, someone (us?) would have to write that first.

@jjallaire

This comment has been minimized.

Show comment
Hide comment
@jjallaire

jjallaire Jul 10, 2017

Member

We can actually already approximate this (albeit awkwardly) via the exportPattern regex (which doesn't use perl = TRUE so can't do lookahead). For example, if we wanted the prefix to be r_nr_ then we could use this export pattern:

exportPattern("^(([^r])|(r[^_])|(r_[^n])|(r_n[^r])|(r_nr[^_]))[[:alpha:]]+")

But of course that wouldn't work with existing instances of exportPattern, and the code we currently have does (because it uses only a _ prefix_. So I'd say if we need to generate new code in NAMESPACE we have what we need for that already (via the regex) and if we don't want to ask users to edit their NAMESPACE then we have a workable solution.

I still think that it can't possibly be intentional that native routine objects (which are not functions and only usable within a .Call) are export by default from R packages generated with package.skeleton. I think an automatic exclusion is warranted, but if that's not going to happen then I think we can be satisfied with either our current workaround or one based on a more elaborate regex in exportPattern.

Member

jjallaire commented Jul 10, 2017

We can actually already approximate this (albeit awkwardly) via the exportPattern regex (which doesn't use perl = TRUE so can't do lookahead). For example, if we wanted the prefix to be r_nr_ then we could use this export pattern:

exportPattern("^(([^r])|(r[^_])|(r_[^n])|(r_n[^r])|(r_nr[^_]))[[:alpha:]]+")

But of course that wouldn't work with existing instances of exportPattern, and the code we currently have does (because it uses only a _ prefix_. So I'd say if we need to generate new code in NAMESPACE we have what we need for that already (via the regex) and if we don't want to ask users to edit their NAMESPACE then we have a workable solution.

I still think that it can't possibly be intentional that native routine objects (which are not functions and only usable within a .Call) are export by default from R packages generated with package.skeleton. I think an automatic exclusion is warranted, but if that's not going to happen then I think we can be satisfied with either our current workaround or one based on a more elaborate regex in exportPattern.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 10, 2017

Member

I was thinking about tuning the pattern in the regexp, but clearly not hard enough.

Member

eddelbuettel commented Jul 10, 2017

I was thinking about tuning the pattern in the regexp, but clearly not hard enough.

@mattfidler

This comment has been minimized.

Show comment
Hide comment
@mattfidler

mattfidler Jul 20, 2017

Hi, I was wondering about the use of R_forceSymbols. If you use this solution, then R cannot use the R_forceSymbols. Perhaps there should be a readme about this...?

mattfidler commented Jul 20, 2017

Hi, I was wondering about the use of R_forceSymbols. If you use this solution, then R cannot use the R_forceSymbols. Perhaps there should be a readme about this...?

@mattfidler

This comment has been minimized.

Show comment
Hide comment
@mattfidler

mattfidler Jul 20, 2017

To be more clear, you cannot use R_forceSymbols(dll,TRUE) beacuase R does not understand symbols that begin with _

mattfidler commented Jul 20, 2017

To be more clear, you cannot use R_forceSymbols(dll,TRUE) beacuase R does not understand symbols that begin with _

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 20, 2017

Member

Can you quote-protect / backtick-protect the underscore? Eg from one of my packages:

getDepends <- function(regexp = ".") {
    .Call('_RcppAPT_getDepends', PACKAGE = 'RcppAPT', regexp)
}

and with that I noticed I did not yet have .registration=TRUE there. Once done this becomes

getDepends <- function(regexp = ".") {
    .Call(`_RcppAPT_getDepends`, regexp)
}
Member

eddelbuettel commented Jul 20, 2017

Can you quote-protect / backtick-protect the underscore? Eg from one of my packages:

getDepends <- function(regexp = ".") {
    .Call('_RcppAPT_getDepends', PACKAGE = 'RcppAPT', regexp)
}

and with that I noticed I did not yet have .registration=TRUE there. Once done this becomes

getDepends <- function(regexp = ".") {
    .Call(`_RcppAPT_getDepends`, regexp)
}
@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 20, 2017

Member

Never mind, I guess, because R_forceSymbols() would be used in C / C++ code.

Do you have a reproducible example of something breaking / no longer working?

I just read up on in Writing R Extension. Are you suggesting we should add R_forceSymbols(dll, TRUE)? For kicks I just added it to the package used for the illustration here (RcppAPT) and it makes no difference. Installs and tests with and without it. I see no extra benefit, really,

Member

eddelbuettel commented Jul 20, 2017

Never mind, I guess, because R_forceSymbols() would be used in C / C++ code.

Do you have a reproducible example of something breaking / no longer working?

I just read up on in Writing R Extension. Are you suggesting we should add R_forceSymbols(dll, TRUE)? For kicks I just added it to the package used for the illustration here (RcppAPT) and it makes no difference. Installs and tests with and without it. I see no extra benefit, really,

@mattfidler

This comment has been minimized.

Show comment
Hide comment
@mattfidler

mattfidler Jul 20, 2017

I did not have a reproducable example, but saw that the new Rcpp exports were prefixed by underscore and assumed that R_forceSymbols would not work.

I'm not sure about the benefits of forcing symbols.

Writing R extensions suggests that forcing symbols is better (though I cannot see how either). There is a risk that the CRAN/R team will eventually force what is suggested as optional, like they did with package registration.

It is clear that .Call(_function_name) won't work since _function_name is not a valid R symbol. Upon testing, .Call('_function_name',...) with backquotes works just fine.

So this is an issue of my understanding, (and not seeing the backquote when looking at the updated code).

Does Rcpp now generate these registration symbols? Or you talking about the package skeleton that one of the R utilities does?

I think you should decide if R_forceSymbols(dll, TRUE) is worthwhile. I'm undecided, but since it is suggested, I personally include it.

mattfidler commented Jul 20, 2017

I did not have a reproducable example, but saw that the new Rcpp exports were prefixed by underscore and assumed that R_forceSymbols would not work.

I'm not sure about the benefits of forcing symbols.

Writing R extensions suggests that forcing symbols is better (though I cannot see how either). There is a risk that the CRAN/R team will eventually force what is suggested as optional, like they did with package registration.

It is clear that .Call(_function_name) won't work since _function_name is not a valid R symbol. Upon testing, .Call('_function_name',...) with backquotes works just fine.

So this is an issue of my understanding, (and not seeing the backquote when looking at the updated code).

Does Rcpp now generate these registration symbols? Or you talking about the package skeleton that one of the R utilities does?

I think you should decide if R_forceSymbols(dll, TRUE) is worthwhile. I'm undecided, but since it is suggested, I personally include it.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Jul 20, 2017

Member

There is a fairly extensive discussion in "Writing R Extensions" which should answer all your questions. If you have additional questions, please consider the r-devel or r-package-devel lists as none of this is really specific to Rcpp.

We are simply helping to prepare user code for the now-recommended ways of interfacing R. And there is no issue here -- we went through the required changes (ie the backtick to protect the underscore) for the recent Rcpp 0.12.12 release.

Member

eddelbuettel commented Jul 20, 2017

There is a fairly extensive discussion in "Writing R Extensions" which should answer all your questions. If you have additional questions, please consider the r-devel or r-package-devel lists as none of this is really specific to Rcpp.

We are simply helping to prepare user code for the now-recommended ways of interfacing R. And there is no issue here -- we went through the required changes (ie the backtick to protect the underscore) for the recent Rcpp 0.12.12 release.

@mattfidler

This comment has been minimized.

Show comment
Hide comment
@mattfidler

mattfidler Jul 20, 2017

That is fine. Thank you.

mattfidler commented Jul 20, 2017

That is fine. Thank you.

@eddelbuettel

This comment has been minimized.

Show comment
Hide comment
@eddelbuettel

eddelbuettel Aug 7, 2017

Member

This can be closed now too with Rcpp 0.12.12 on CRAN,

Member

eddelbuettel commented Aug 7, 2017

This can be closed now too with Rcpp 0.12.12 on CRAN,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment