You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I ran the test of vegan Examples in Rprof() and found out that we are very heavily using permutation tests for constrained ordination. The internal getF function within permutest.cca code used more than 25% computing time of testing vegan examples. Even small speed-up would be valuable in such a central function. I have had good success with new C code (minimum terms for designdist) and with improved user-interface to C (sequential null models in particular, and also other cases). The getF function is more tricky than other cases I have dealt with, because it calls high lever functions like QR decomposition and SVD, and all that should also be written in C, and that has scared me away from getF.
I decided to try writing a C function and now I am pretty confident this can be done, and that this is useful: speed-up is better than I hoped. I have now implemented testing RDA models. The new C function is a drop-in replacement to the current getF, and I have an experimental interface to select either old and new code with argument C (defaults FALSE). The following would implement both the old and new testing:
mod<- rda(dune~ Condition(Management) +Moisture+A1, dune.env)
p<- shuffleSet(nrow(dune), 99) ## to guarantee same permutations in all analysessol0<- permutest(mod, perm=p) ## using R codesol1<- permutest(mod, perm=p, C=TRUE) ## using compiled code
With the same permutations (or the same random number seed), the results should be identical within floating point accuracy. Items sol0$num, sol0$den and sol0$F.perm contain the numerator and denominator eigenvalues and F.perm the F statistic scaled by degrees of freedom. There is currently a full implementation for RDA only (including analysis of only first eigenvalues first = TRUE), but I intend to implement other cases with time. Timing has shown more than 2x speed-up in several cases in my systems.
The C code is oddly readable, but it will probably grow more messy when I add if-statements for weighted analysis (CCA) and distance-based analysis. Using QR decomposition with Linpack is pretty simple, but I must say that having such a simple thing as SVD with ready Lapack function took many more lines than I anticipated.
Please inspect, test, comment and improve. The new code is in branch do-getF. I haven't merged anything to the master yet and there is no pull request either. I expect things to mature first.
The text was updated successfully, but these errors were encountered:
PR #212 provides a drop-in replacement in C for permutations in permutest.cca. No R code was removed, and the same old R code is used by default. You have to set argument C = TRUE in permutest to invoke the use of compiled code (default is C = FALSE). The results should be identical within floating point accuracy in both ways. The compiled code will become the default and R code will be removed if this proves to be successful.
In my desktop the time used in permutest.cca dropped from 27% to 15% in running all R examples code of R CMD check. This means about 10% speed-up in executing examples when checking the package. There were some minor changes of magnitude 1e-15 in the results.
See branch anova-cca-use-permutest for latest development. This makes anova(..., by = "term") directly use permutest(..., by = "term") and adds a new alternative by = "seqaxis" for novel sequential test of axes using one-degree-of-freedom contrasts on axes in permutest.
I ran the test of vegan Examples in
Rprof()
and found out that we are very heavily using permutation tests for constrained ordination. The internalgetF
function withinpermutest.cca
code used more than 25% computing time of testing vegan examples. Even small speed-up would be valuable in such a central function. I have had good success with new C code (minimum terms fordesigndist
) and with improved user-interface to C (sequential null models in particular, and also other cases). ThegetF
function is more tricky than other cases I have dealt with, because it calls high lever functions like QR decomposition and SVD, and all that should also be written in C, and that has scared me away fromgetF
.I decided to try writing a C function and now I am pretty confident this can be done, and that this is useful: speed-up is better than I hoped. I have now implemented testing RDA models. The new C function is a drop-in replacement to the current
getF
, and I have an experimental interface to select either old and new code with argumentC
(defaultsFALSE
). The following would implement both the old and new testing:With the same permutations (or the same random number seed), the results should be identical within floating point accuracy. Items
sol0$num
,sol0$den
andsol0$F.perm
contain the numerator and denominator eigenvalues andF.perm
the F statistic scaled by degrees of freedom. There is currently a full implementation for RDA only (including analysis of only first eigenvaluesfirst = TRUE
), but I intend to implement other cases with time. Timing has shown more than 2x speed-up in several cases in my systems.The C code is oddly readable, but it will probably grow more messy when I add if-statements for weighted analysis (CCA) and distance-based analysis. Using QR decomposition with Linpack is pretty simple, but I must say that having such a simple thing as SVD with ready Lapack function took many more lines than I anticipated.
Please inspect, test, comment and improve. The new code is in branch do-getF. I haven't merged anything to the master yet and there is no pull request either. I expect things to mature first.
The text was updated successfully, but these errors were encountered: