Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we generate extendr-wrappers.R at the same time of compiling Rust code? #56

Closed
yutannihilation opened this issue Mar 6, 2021 · 30 comments

Comments

@yutannihilation
Copy link
Contributor

As described on helloextendr's README, the current steps to produce the wrapper are:

# Compile the Rust code (If you are using RStudio, just run "Install and Restart")
devtools::install()

# Re-generate wrappers
rextendr::register_extendr()

# Re-generate documentation and NAMESPACE (If you are using RStudio, just run "Document")
devtools::document()

But, I come to wonder if explicit rextendr::register_extendr() is really needed. Can we generate it on compilation by tweaking some build.rs? (Or Makevars?)

@yutannihilation
Copy link
Contributor Author

by tweaking some build.rs?

Apparently, it cannot because build.rs is what is executed before compilation.

But, Makevars works. It's just creating main.rs like this and do cargo run in Makevars. Can we switch to this?

fn main() {
    let metadata = helloextendr::get_helloextendr_metadata();
    print!(
        "{}",
        metadata.make_r_wrappers(true, "helloextendr").unwrap()
    );
}

full commit: yutannihilation/helloextendr@e556d81

@yutannihilation
Copy link
Contributor Author

@clauswilke @andy-thomason @Ilia-Kosenkov
Sorry I think I didn't follow the discussion on the wrapper-generation completely, but are there any reason the wrapper generation needs to be called from R session? The Makevars-method I described above seems to work, and I'm wondering if there's something I misunderstand.

@clauswilke
Copy link
Member

Yes, your Makevars method should work also. It's just a matter of preference. Do you want more boilerplate Rust code or do you want an extra R call?

Also, have you tested your method on Windows? Didn't we have problems with cargo run there or has that been solved? And on Windows, we'll generate the wrappers twice, but maybe that's not a big deal.

@yutannihilation
Copy link
Contributor Author

It's just a matter of preference. Do you want more boilerplate Rust code or do you want an extra R call?

I don't think this is just my preference. This matters because if this works the current steps:

# Compile the Rust code (If you are using RStudio, just run "Install and Restart")
devtools::install()

# Re-generate wrappers
rextendr::register_extendr()

# Re-generate documentation and NAMESPACE (If you are using RStudio, just run "Document")
devtools::document()

can be reduced to just one step (currently, pkgbuild::needs_compile() doesn't consider .rs files, but I think it's possible as it's just add .rs here: https://github.com/r-lib/pkgbuild/blob/d9717aad6ff0534846b36a827e8444174dc83a1a/R/compile-dll.R#L115):

devtools::document()

The problem with cargo run is a good point, thanks. I guess we can just cargo build and execute the binary, but I need to test it.

@yutannihilation
Copy link
Contributor Author

Ah, just remembered the problem with cargo run happens when it tries to cross-compile for 32 bit, so I guess it won't be a problem. Probably we can just skip it on 32 bit Windows because it's only needed on development, not on installation. But anyway I'll check it.

@clauswilke
Copy link
Member

One other issue to think about: Do we have potential problems with the path where the Makevars approach is executed? register_extendr() works in the package source directory, by definition. The Makevars approach works in the build directory. Are there workflows where that is not the package directory? (E.g., if people don't use devtools()? Do such people exist?)

@clauswilke
Copy link
Member

clauswilke commented Mar 6, 2021

One last issue: We have had discussions about more complex wrapper code, see here:
extendr/extendr#184
If we move the generation of the wrapper code into Makevars then we need to generate this code on the Rust side. I'm still not sure I like this idea.

One other thought: Things are a bit awkward with the current devtools::document(), but is that in part because it doesn't know anything about rextendr? Could we modify devtools::document() so it calls register_extendr() when needed? Doesn't it do the equivalent for cpp11?

@clauswilke
Copy link
Member

Would like to get @dfalbel's perspective also.

@yutannihilation
Copy link
Contributor Author

Do we have potential problems with the path where the Makevars approach is executed?

I don't think there's any problem as it's simple as just redirecting stdout to R/extendr-wrappers.R. If the user is skilled enough to rearrange the package structure other than the one use_extendr() creates, they should figure out what to do.

(E.g., if people don't use devtools()? Do such people exist?)

Sorry, I don't get the point here. Why does the use of devtools() matter?

If we move the generation of the wrapper code into Makevars then we need to generate this code on the Rust side. I'm still not sure I like this idea.

Sorry, I don't understand this point either... In either way, the wrapper code is generated on Rust's side. The difference is how <Metadata>.make_r_wrappers() is called, directly by the executable or via the C API exposed to R. Am I wrong?

Things are a bit awkward with the current devtools::document(), but is that in part because it doesn't know anything about rextendr? Could we modify devtools::document() so it calls register_extendr() when needed? Doesn't it do the equivalent for cpp11?

No, I'd say they are different in that register_cpp11() works without installing or loading the package.

I would emphasize this point. The current rextendr's workflow has a chicken-egg problem; if R/extendr-wrapper.R is accidentally lost, the user cannot recover. To generate the wrapper, we need at least useDynLib declared, but it would be declared in R/extendr-wrapper.R. The Makevars approach can avoid this; what the user needs to do is just compiling.

If we want "the equivalent for cpp11", I guess we need to build another infrastructure for parsing the Rust code from R's side, just like decor package does for cpp11: https://github.com/r-lib/decor. This would be a very different approach than the current one.

@clauswilke
Copy link
Member

Sorry, I don't get the point here. Why does the use of devtools() matter?

If you follow the instructions in the first section of this post, for example, I don't think the wrappers would end up in the right location: https://kbroman.org/pkg_primer/pages/build.html

In either way, the wrapper code is generated on Rust's side.

We're talking about more wrapper code that doesn't exist yet. It's not a given that it will be generated on the Rust side. There is no reason that it has to, I believe.

@yutannihilation
Copy link
Contributor Author

R CMD build and R CMD INSTALL, or devtools::build() and devtools::install(), are the steps to install the package, not to develop it. The developer need to run devtools::document() to generate wrappers, anyway. (So, I think we cannot support those who don't use devtools. c.f. usethis::use_cpp11() fails if the user doesn't use roxygen to generate documents).

There is no reason that it has to, I believe.

Yes, I agree with you here. So, are you arguing that it might be easier to tweak the generated R code programmatically if we do it in register_extedr()? I feel we can do the similar thing by Makevars approach, but this seems a good point. Thanks.

@clauswilke
Copy link
Member

are you arguing that it might be easier to tweak the generated R code programmatically if we do it in register_extendr()?

Yes, I've never been that excited about having compiled Rust code spit out R code.

I'm aware of the chicken-and-egg problem and I don't like it but I also think we shouldn't parse Rust code in R if we can avoid it.

Do you know why devtools::document() compiles the Rust source code? It seems to me that since this happens there should be a way to take advantage of it, possibly with some modification to that function.

I think more generally we should take a bigger picture perspective and not worry so much about how things work with the current devtools::document() but how they could work if we propose some modifications there. If we could somehow register a function that is run after the package is compiled but before the documentation is generated that is all we need.

@clauswilke
Copy link
Member

I think this is the explanation for why the package gets compiled: https://roxygen2.r-lib.org/reference/load.html
Maybe there's a way to piggy-back on this process.

@yutannihilation
Copy link
Contributor Author

Okay, I think I'm getting your point...

If we could somehow register a function that is run after the package is compiled but before the documentation is generated that is all we need.

I think it's here, which is called inside load_all() after compile_dll(). Currently it only supports cpp11 and Rcpp, but I guess we can propose some option to insert arbitrary functions here.

https://github.com/r-lib/pkgbuild/blob/2aaecf838eb023b788b7a31e267ac644d616dae0/R/c-registration.R#L1-L12

@clauswilke
Copy link
Member

We could certainly write a function rextendr::document() that makes sure extendr-wrapper.R exists or otherwise adds a minimal version, then compiles the package and generates the wrappers, and then documents. If we're clever we may need only one compilation of the source code, since the compiled code doesn't change with a new wrapper file.

I think all of this comes down to mostly writing a new load function that does the following three steps:

  1. pkgload::load_all()
  2. rextendr::register_extendr()
  3. source the newly generated extendr wrappers so they overwrite the old code loaded with load_all()

@clauswilke
Copy link
Member

I typed out my comment while you typed yours. Yes, modifying update_registration() in pkgbuild may be the way to go.

@clauswilke
Copy link
Member

Actually, not so fast. I think update_registration() is called before compilation, but we need something that gets called after.

https://github.com/r-lib/pkgbuild/blob/d9717aad6ff0534846b36a827e8444174dc83a1a/R/compile-dll.R#L37-L54

@yutannihilation
Copy link
Contributor Author

Ah, true. Hmm...

@clauswilke
Copy link
Member

Maybe it's my long-term use of LaTeX, but I'm totally comfortable with keeping it at this for now:

devtools::install(quick = TRUE)
rextendr::register_extendr()
devtools::document()

And register_extendr() could have an optional feature that forces a minimal extendr-wrappers.R if none exists, so that recovery of a broken package would work as follows:

rextendr::register_extendr(force_wrappers = TRUE)
devtools::install(quick = TRUE)
rextendr::register_extendr()
devtools::document()

Then, the next step might be to implement rextendr::document() that does the above more efficiently, and finally we could go to the devtools developers and ask if they're willing to modify their code to accommodate us.

@yutannihilation
Copy link
Contributor Author

Hmm, okay. I feel I'll probably end up generating the wrappers in onLoad(), but I'll try to stick with it for a while.

c.f. https://stackoverflow.com/a/32035219

@yutannihilation
Copy link
Contributor Author

Reflect quick = TRUE to helloextendr's README: extendr/helloextendr@8fd66d9

@yutannihilation
Copy link
Contributor Author

I'm not fully convinced, but I also feel I probably cannot convince other members with my idea. I'm closing this and will file a new one for rextendr::document(). Let's stick with the current steps with manual rextendr::register_extendr() for now. Sorry for the noise.

@clauswilke
Copy link
Member

clauswilke commented Mar 10, 2021

I'm going to reopen this issue. I'm now quite convinced that there is no reliable way that an R function can build a package, load the dynamic library, and then call a function from it. This sequence of steps is only reliable if R gets restarted in-between, and that's not something an R function can (or should) trigger. Without restart, there's always the risk that the shared library doesn't get updated.

So, we need to look for other alternatives. Building the wrappers during compile is attractive. However, I can also now better formulate what was bothering me about the proposed approach: I don't think a compile should overwrite the R package sources. That's just too much magic.

Therefore, how about the following: What if the compile just generates the wrappers and leaves them in target, using the technique you proposed. Then, register_extendr() can fish the wrappers out from there and place them into R/extendr-wrappers.R. And, if register_extendr() doesn't find any wrappers in target, it could alternatively try to call the shared library and get wrappers that way. This makes the whole sequence more robust in case somebody doesn't want to build the wrappers during compile, for whatever reason.

@clauswilke clauswilke reopened this Mar 10, 2021
@yutannihilation
Copy link
Contributor Author

I don't think a compile should overwrite the R package sources. That's just too much magic.

Thinking about this issue again, I now agree with this part.

Yet another alternative might be to call cargo run (or compiled executable binary) from register_extendr(). but fishing the wrapper is probably generally useful (e.g. when the developer needs to tweak the generated wrapper).

@clauswilke
Copy link
Member

Yet another alternative might be to call cargo run (or compiled executable binary) from register_extendr().

I thought about it. In theory it's possible to generate the main.rs code on the fly and compile and run the binary when register_extendr() is called. In practice though it's probably too complicated, and may interfere with a main.rs the developer has placed there for some other reason. I think it's better to set this up as part of the template structure and have it run with every build.

@yutannihilation
Copy link
Contributor Author

To be clear, I think it doesn't need to be main.rs. We can use bin/<whatever name>.rs and do cargo run --bin <whatever name>. But anyway, in that case, probably we need to place generated <whatever name>.rs into bin/ and then remove it, which sounds unsafe. So I have no objection.

c.f. https://doc.rust-lang.org/cargo/guide/project-layout.html#package-layout

.
├── Cargo.lock
├── Cargo.toml
├── src/
│   ├── lib.rs
│   ├── main.rs
│   └── bin/
│       ├── named-executable.rs
│       ├── another-executable.rs
│       └── multi-file-executable/
│           ├── main.rs
│           └── some_module.rs
...snip...

@yutannihilation
Copy link
Contributor Author

Will you implement this?

@clauswilke
Copy link
Member

It's unlikely I'd get to it this week. So if you have time and want to give it a try please go ahead.

@yutannihilation
Copy link
Contributor Author

Sure, I'll try a PR.

@yutannihilation
Copy link
Contributor Author

We explored this option on #63, but we found it doesn't look nicer than rextendr::document() considering the complexity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants