Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting rid of R+Rcpp dependencies #20

Closed
tnagler opened this issue Jan 15, 2017 · 13 comments
Closed

Getting rid of R+Rcpp dependencies #20

tnagler opened this issue Jan 15, 2017 · 13 comments

Comments

@tnagler
Copy link
Collaborator

tnagler commented Jan 15, 2017

I wrote my unit tests in #18 and #19 by hardcoding input values and VineCopula output - mainly because I was to lazy to deal with the R interface (we will remove it at some point anyway).

It is a bit ugly, but there's little reason not to do it. The tests catch 99% of the errors the R interface would and can be used even with R interface removed. A cleaner version of this would be to write an R script that writes inputs and results into a file which is loaded from within the tests.

What do you think?

@tvatter
Copy link
Collaborator

tvatter commented Jan 20, 2017

I completely agree that it would be nice to remove the dependencies. However, the R interface helped me a lot when writing the pdf/hfunc/hinv for parametric families to benchmark against VineCopula and I think that it can still be useful in the development phase.

I like the idea of using an R script that writes inputs and outputs into files, but there may be a "size" problem. Currently, we are running 5 tests (pdf/hfunc1-2/hinv1-2, not counting tau/loglik) for 4+3*4 families (4 rotationless and 3 with rotations), i.e. 80 tests, for a sample size of 10'000, which means that the output would be a relatively large file (~15MB on my computer)... So we would need to run the R script and clean-up the files after the test is run (i.e., we would still need the R dependencies). Furthermore, we would need still a way to let C++ pass arguments to the R script (at least as long as we're in the development phase).

@tnagler
Copy link
Collaborator Author

tnagler commented Jan 20, 2017

Is it really necessary to run the tests with a sample size of 10'000? If there's a bug in the code, it will probably also show up when n = 10. The only thing that such a large sample size adds is to (randomly) test the stability of the C++ implementaiton near the boundaries. But this could be done with a few deterministic points as well.

As for passing arguments to R script. We could write a separate setup file which is loaded from both C++ and the R-script.

@tvatter
Copy link
Collaborator

tvatter commented Jan 20, 2017

Concerning the sample size: I agree that it may not be necessary for the tests of pdf/hfunc1-2/hinv1-2 and that ~100 carefully chosen points may be enough. In fact, I increased the sample size to 10'000 to properly test bicop_sel, that is to almost guarantee that the right family was chosen (i.e., the probability of random build failure is small). However, if you find a way to carefully create a "small" sample (for each of the 16 tested families) that is "representative enough" (i.e., such that the the right family is chosen), I'm OK with that!

I also agree with the separate setup file. However, we would still depend on R or am I missing something?

@tnagler
Copy link
Collaborator Author

tnagler commented Jan 20, 2017

We would "depend" on R to create the "expected results" file. But my idea was to do that manually once, not re-creating it every time we build the project. So the C++ project and tests do not depend on R, just on this one file that we ship with the library.

@tvatter
Copy link
Collaborator

tvatter commented Jan 20, 2017

I'm OK with this solution, but I would still ship the R script and setup file with the library (so that one can check and/or modify if needed).

@tnagler
Copy link
Collaborator Author

tnagler commented Jan 20, 2017

Yes!

@slayoo
Copy link
Collaborator

slayoo commented Feb 8, 2017

What about creating three CMake projects within the vinecopulib repository:

  • the lib + non-R-dependent tests
  • the R-dependent tests
  • the python bindings

People not interested in Python or tests would not need to be bothered with those at all.
Generating a build on a platform where you cannot get R or Python would be straightforward (no hacks as in the Appveyor branch)

@tvatter
Copy link
Collaborator

tvatter commented Feb 8, 2017

First, I think that creating something for the non-R-dependent tests is not very useful right now, since we'll remove the R dependencies altogether at some point (see my answer to the ASAN issue).

Second, about the Python bindings: couldn't we include this as an option to the main CMake project that would be passed using e.g. -DCMAKE_BUILD_PYTHON_BINDINGS=TRUE ? Or do you think that having two CMake projects make more sense (and if that is the case, why) ?

@tnagler
Copy link
Collaborator Author

tnagler commented Feb 8, 2017

Hmm, separating the R dependent tests from the lib + non-R-dependent tests doesn't sound too bad. That way, we don't have an R dependency for users, but still have the interface available for unit testing. This would make it a lot less urgent to remove the R dependency (if necessary at all), for which I won't have time in the next 2-3 weeks.

@slayoo
Copy link
Collaborator

slayoo commented Feb 8, 2017

Two advantages I can think of are:

  • simplicity: both the library and bindings CMake stuff would be simpler
  • we would be forced to write and test CMake code for handling find_package(vinecopulib) which would be of great benefit for the users - the bindings would then serve as one of examples of client code

... but certainly there are other more urgent things - this comment was rather an afterthought from the fights with Travis and Appveyor :)

@tnagler
Copy link
Collaborator Author

tnagler commented Mar 8, 2017

To summarize what we discussed in the last weeks:

  • The python bindings will get their own repository as soon they are reasonably mature (soon).
  • R-related tests will get a CMake switch, so the library can be build (and partially tested) without the R dependencies.

@tvatter tvatter mentioned this issue Mar 10, 2017
19 tasks
@tnagler
Copy link
Collaborator Author

tnagler commented Mar 14, 2017

So now we have another solution. Rcpp, RcppEigen, RInside dependencies are removed. So only R is required as dependency. This was addressed by 366a339 with some bug-fixes following (latest b3be63f). It's probably still a good idea to make a CMake flag, because R is not necessary for using the library.

@tvatter
Copy link
Collaborator

tvatter commented Mar 15, 2017

As discussed with Thomas, I added dealt with this issue using the standard CMake flag to avoid compiling unit tests (i.e., DBUILD_TESTING). Since almost every unit test require R, this serves the same purpose, and we simply avoid requiring rscript in findR.cmake when using -DBUILD_TESTING=OFF.

See 94ef873 and #59.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants