Faster permutest.cca with compiled code #211
Comments
PR #212 provides a drop-in replacement in C for permutations in [...] On my desktop the time used in [...]
See branch anova-cca-use-permutest for latest development. This makes [...]
I ran the vegan examples under `Rprof()` and found that we use permutation tests for constrained ordination very heavily. The internal `getF` function within the `permutest.cca` code took more than 25% of the computing time of the vegan examples. Even a small speed-up would be valuable in such a central function. I have had good success with new C code (minimum terms for `designdist`) and with an improved user interface to C (sequential null models in particular, and also other cases). The `getF` function is trickier than the other cases I have dealt with, because it calls high-level functions like QR decomposition and SVD, and all of that must also be written in C, which has scared me away from `getF`.

I decided to try writing a C function, and now I am pretty confident that this can be done and that it is useful: the speed-up is better than I hoped. I have now implemented testing of RDA models. The new C function is a drop-in replacement for the current `getF`, and I have an experimental interface to select either the old or the new code with argument `C` (defaults to `FALSE`). The following would implement both the old and new testing:

With the same permutations (or the same random number seed), the results should be identical within floating point accuracy. Items `sol0$num` and `sol0$den` contain the numerator and denominator eigenvalues, and `sol0$F.perm` the F statistic scaled by degrees of freedom. There is currently a full implementation for RDA only (including analysis of only first eigenvalues, `first = TRUE`), but I intend to implement the other cases with time. Timing has shown more than 2x speed-up in several cases on my systems.

The C code is oddly readable, but it will probably grow messier when I add if-statements for weighted analysis (CCA) and distance-based analysis. Using QR decomposition with Linpack is pretty simple, but I must say that doing such a simple thing as SVD with a ready-made Lapack function took many more lines than I anticipated.
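
To make the "F statistic scaled by degrees of freedom" and the "identical within floating point accuracy" check concrete, here is a minimal sketch in C. It is not the code in the branch: the function names `pseudo_f`, `perm_pvalue` and `max_abs_diff`, the tolerance argument `eps`, and the (hits + 1) / (nperm + 1) p-value convention are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: F statistic from numerator and denominator
 * eigenvalue sums, each scaled by its degrees of freedom. */
double pseudo_f(double num, double den, double df_num, double df_den)
{
    assert(df_num > 0.0 && df_den > 0.0 && den > 0.0);
    return (num / df_num) / (den / df_den);
}

/* Hypothetical helper: permutation p-value, counting permuted F values
 * at least as large as the observed one (within tolerance eps).
 * The +1 terms count the observed statistic as one permutation
 * (assumed convention). */
double perm_pvalue(const double *f_perm, size_t nperm, double f_obs, double eps)
{
    size_t hits = 0;
    for (size_t i = 0; i < nperm; i++) {
        if (f_perm[i] >= f_obs - eps)
            hits++;
    }
    return (double)(hits + 1) / (double)(nperm + 1);
}

/* Hypothetical helper: largest absolute difference between two result
 * vectors, e.g. permuted F values from the old and the new code run
 * with the same permutations. */
double max_abs_diff(const double *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] - b[i];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

With helpers like these, one could compare the permuted F values from the old and the new code and accept the new code when `max_abs_diff` stays below a small tolerance chosen for the data at hand.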
Please inspect, test, comment and improve. The new code is in branch do-getF. I haven't merged anything to the master yet and there is no pull request either. I expect things to mature first.