Implement residualized predictor permutation as an option in anova()/permutest() methods for CCA/RDA/dbRDA
#542
Comments
I read the paper, and it makes sense. Now the question is how to implement this in vegan.
An intriguing detail is that CCA permutation tests re-weighted X-variates by permuted community weights while we used R code, that is, up to vegan release 2.4-6.
Here is a quick test that verifies that the current ...
There is no similar problem in unweighted analysis (RDA), and indeed, vegan 2.4-6 also seems to be unbiased. The tests became biased when I implemented them in C and dropped weighting, as specified in the commit message in the previous message. I don't remember if I dropped the re-weighting scheme before getting it to work. However, building blocks for re-weighted simulations should be available in old GitHub commits.
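To make the re-weighting concrete, here is a minimal sketch in plain Python. It is not vegan's R or C code. It assumes a single predictor `x`, row weights `w` that sum to one, and "working data" defined as the weighted-centred predictor scaled by `sqrt(w)`. All function names are hypothetical. The sketch contrasts shuffling the finished working data with shuffling the raw predictor and then applying the weights of the new sampling units.

```python
import math

def weighted_working(x, w):
    """Centre x by its w-weighted mean, then scale each row by sqrt(w)."""
    wmean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    return [math.sqrt(wi) * (xi - wmean) for wi, xi in zip(w, x)]

def permute_working(x, w, perm):
    """Non-reweighted scheme: shuffle the already weighted working data."""
    xw = weighted_working(x, w)
    return [xw[i] for i in perm]

def permute_reweighted(x, w, perm):
    """Re-weighted scheme: shuffle raw x, then apply the weights of the new units."""
    x_perm = [x[i] for i in perm]
    return weighted_working(x_perm, w)

def intercept_dot(v, w):
    """Inner product with sqrt(w), the intercept in weighted working space."""
    return sum(math.sqrt(wi) * vi for wi, vi in zip(w, v))

x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical predictor
w = [0.10, 0.20, 0.30, 0.25, 0.15]     # hypothetical row weights, sum to 1
perm = [4, 3, 2, 1, 0]                 # one permutation of the sampling units

print(intercept_dot(permute_reweighted(x, w, perm), w))  # zero up to rounding
print(intercept_dot(permute_working(x, w, perm), w))     # generally non-zero
```

In this toy setting the re-weighted predictor stays orthogonal to the weighted intercept after permutation, while the permuted working data generally do not. That is one way to see why the two schemes can give different null distributions when weights vary among sampling units.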
Cajo's paper concerns CCA, or permutation in weighted ordination. The crux is that Canoco and vegan 2.5-1 to 2.6-4 ignored weights and just permuted the internal matrices ("working data"). However, when we permute predictors, they should be re-weighted using the weights of their new sampling units. I have now implemented this in branch biased-anova-cca, and the new results seem to be unbiased under the null model. This means that permutation P-values have a uniform distribution with randomized or random predictors. The implementation is different from the one Cajo outlined, because we use a completely different algebraic tool set (QR decomposition, different handling of predictors in parallel models), although the results are equal.
Up to CRAN release 2.4-6, vegan permutation tests had re-weighting (and were not Canoco compliant), and the first version with the current non-reweighted permutations was 2.5-1 (Apr 14, 2018). This switch was made together with moving from R code to C in permutation tests. However, I first started to implement re-weighted permutation in C, but then decided to go for simpler Canoco-compliant code and removed the re-weighting code. So the greatest change in this branch was reverting those two commits in 0fbded3 and 740f434. This gave working compiled code, but ...
In this process I also found a noteworthy inefficiency in the design of ...
There is still one issue that needs scrutiny and a fix, but it seems that we can live with this if we are careful. It seems that the C function changes its input, and if called repeatedly with the same data, the data will change and results will drift. This happens only in weighted analysis (CCA), but I haven't found the reason for this. This drift was the thing that took most of my time: I have parallel processing as default in my working environment, and my tests consistently failed, and I was looking for the reason in the C code (which proved to be correct) or in ...
update 5/11/22: We cannot live with this. Tests fail also with ...
update 6/11/22: QR decomposition must be duplicated in ...
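The drift described above is the general failure mode of a routine that overwrites its input. The sketch below is a hypothetical Python illustration of that mode only. It does not reproduce vegan's C function or its QR handling, and the function names are made up. It shows how repeated calls on the same object drift when weights are applied in place, and how duplicating the input first keeps results stable.

```python
import math

def weight_rows_inplace(x, w):
    """Scale x by sqrt(w), overwriting the caller's list (the risky pattern)."""
    for i, wi in enumerate(w):
        x[i] *= math.sqrt(wi)
    return x

def weight_rows_copy(x, w):
    """Duplicate the input first, so the caller's data are never modified."""
    return weight_rows_inplace(list(x), w)

w = [0.25, 0.25, 0.5]                    # hypothetical row weights

data = [1.0, 2.0, 3.0]
first = list(weight_rows_inplace(data, w))
second = list(weight_rows_inplace(data, w))
print(first == second)                   # False: the second call re-weights weighted data

data = [1.0, 2.0, 3.0]
print(weight_rows_copy(data, w) == weight_rows_copy(data, w))  # True: input untouched
```

With unit weights the in-place version would leave the data unchanged, which is consistent with a problem that shows up only in weighted analysis.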
I have finally finished with upgraded ...
Cajo's test only concerned coverage: 5 % of randomized P-values should be at P < 0.05. I also looked at the full distribution of permutation P-values, which should have a uniform distribution with randomized data.
I will merge this change to the master branch, but I will still have a look at two issues (and more issues may appear in testing):
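The coverage check mentioned above can be sketched with a toy simulation. This is a minimal Python illustration, not the test used for vegan. It assumes an unweighted setting with one random Gaussian predictor and one random Gaussian response, uses the absolute cross-product as the test statistic, and counts P <= 0.05 because permutation P-values are discrete. All names and settings are hypothetical.

```python
import random

def perm_pvalue(x, y, nperm, rng):
    """Permutation P-value for the association of x and y, permuting x."""
    my = sum(y) / len(y)
    yc = [v - my for v in y]             # centred response

    def stat(xs):
        # |sum x_i * yc_i|; centring x is unnecessary because yc sums to zero
        return abs(sum(a * b for a, b in zip(xs, yc)))

    observed = stat(x)
    xs = list(x)                         # copy, so the caller's x is not shuffled
    hits = 0
    for _ in range(nperm):
        rng.shuffle(xs)
        if stat(xs) >= observed:
            hits += 1
    return (hits + 1) / (nperm + 1)      # observed statistic counts as one permutation

def null_pvalues(nsim, n, nperm, seed):
    """P-values from nsim data sets in which x and y are independent."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(nsim):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        pvals.append(perm_pvalue(x, y, nperm, rng))
    return pvals

def rejection_rate(pvals, alpha=0.05):
    """Proportion of P-values at or below alpha."""
    return sum(p <= alpha for p in pvals) / len(pvals)

pvals = null_pvalues(nsim=400, n=20, nperm=99, seed=1)
print(rejection_rate(pvals))             # should be close to 0.05 for an unbiased test
print(sum(pvals) / len(pvals))           # should be close to 0.5 if P-values are uniform
```

A biased test would show a rejection rate clearly above the nominal level, and a histogram of `pvals` would pile up near zero instead of being flat.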
This implements re-weighted permutation tests with residualized permutations as suggested by ter Braak, C.J.F. & te Beest, D.E. Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal. Environ Ecol Stat 29, 849–868 (2022). This also fixes some smaller details, such as a clumsy interface that made parallel execution slower than serial execution.
Cajo has a new paper (ter Braak & te Beest, 2022) out showing that the residualized response permutation method in Canoco 5 (versions < 5.15) and vegan doesn't work so well (has grossly inflated Type I error rates) in situations where (quoted from abstract)
{ade4} has response permutation and so it isn't affected, but it can't test partial ordinations. So, either we warn people about the specific issues of overdispersion in the presence of highly variable site totals (and we know how that will go down: they won't read the help and/or will ignore warnings we print), or we do that and implement the residualized predictor permutation method that Cajo & co describe in the paper. There's R code associated with the paper, so implementing shouldn't be hard, though it might need converting to vegan-like R code (? I don't know, haven't looked at it yet) and checking on the licence under which it was distributed (or asking Cajo for permission to include it in vegan), or we write our own implementation from the paper description.
(@jarioksa if you don't have access to the paper, let me know and I'll send the PDF your way)
References
ter Braak, C.J.F., te Beest, D.E. Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal. Environ Ecol Stat 29, 849–868 (2022). https://doi.org/10.1007/s10651-022-00545-4