Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster permutest.cca with compiled code #211

Closed
jarioksa opened this issue Oct 18, 2016 · 2 comments
Closed

Faster permutest.cca with compiled code #211

jarioksa opened this issue Oct 18, 2016 · 2 comments

Comments

@jarioksa
Copy link
Contributor

jarioksa commented Oct 18, 2016

I ran the test of vegan Examples in Rprof() and found out that we are very heavily using permutation tests for constrained ordination. The internal getF function within permutest.cca code used more than 25% computing time of testing vegan examples. Even small speed-up would be valuable in such a central function. I have had good success with new C code (minimum terms for designdist) and with improved user-interface to C (sequential null models in particular, and also other cases). The getF function is more tricky than other cases I have dealt with, because it calls high lever functions like QR decomposition and SVD, and all that should also be written in C, and that has scared me away from getF.

I decided to try writing a C function and now I am pretty confident this can be done, and that this is useful: speed-up is better than I hoped. I have now implemented testing RDA models. The new C function is a drop-in replacement to the current getF, and I have an experimental interface to select either old and new code with argument C (defaults FALSE). The following would implement both the old and new testing:

mod <- rda(dune ~ Condition(Management) + Moisture + A1, dune.env)
p <- shuffleSet(nrow(dune), 99) ## to guarantee same permutations in all analyses
sol0 <- permutest(mod, perm=p) ## using R code
sol1 <- permutest(mod, perm=p, C=TRUE) ## using compiled code

With the same permutations (or the same random number seed), the results should be identical within floating point accuracy. Items sol0$num, sol0$den and sol0$F.perm contain the numerator and denominator eigenvalues and F.perm the F statistic scaled by degrees of freedom. There is currently a full implementation for RDA only (including analysis of only first eigenvalues first = TRUE), but I intend to implement other cases with time. Timing has shown more than 2x speed-up in several cases in my systems.

The C code is oddly readable, but it will probably grow more messy when I add if-statements for weighted analysis (CCA) and distance-based analysis. Using QR decomposition with Linpack is pretty simple, but I must say that having such a simple thing as SVD with ready Lapack function took many more lines than I anticipated.

Please inspect, test, comment and improve. The new code is in branch do-getF. I haven't merged anything to the master yet and there is no pull request either. I expect things to mature first.

@jarioksa
Copy link
Contributor Author

PR #212 provides a drop-in replacement in C for permutations in permutest.cca. No R code was removed, and the same old R code is used by default. You have to set argument C = TRUE in permutest to invoke the use of compiled code (default is C = FALSE). The results should be identical within floating point accuracy in both ways. The compiled code will become the default and R code will be removed if this proves to be successful.

In my desktop the time used in permutest.cca dropped from 27% to 15% in running all R examples code of R CMD check. This means about 10% speed-up in executing examples when checking the package. There were some minor changes of magnitude 1e-15 in the results.

@jarioksa
Copy link
Contributor Author

jarioksa commented May 12, 2017

See branch anova-cca-use-permutest for latest development. This makes anova(..., by = "term") directly use permutest(..., by = "term") and adds a new alternative by = "seqaxis" for novel sequential test of axes using one-degree-of-freedom contrasts on axes in permutest.

@jarioksa jarioksa added this to the 2.5-0 milestone May 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant