Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

permutest.cca breaks tied values (a case of R FAQ 7.31) #132

Closed
jarioksa opened this issue Aug 24, 2015 · 1 comment
Closed

permutest.cca breaks tied values (a case of R FAQ 7.31) #132

jarioksa opened this issue Aug 24, 2015 · 1 comment
Labels

Comments

@jarioksa
Copy link
Contributor

Sven Neulinger reported a problem with anova.cca with tied values. Here an excerpt of his email:

"Since anova.cca() uses permutation tests, there is a lower limit for the
P-value with small numbers of replicates per treatment group (n) even if
the F-value is very large [e.g., Clarke, K.R. 1993. Non-parametric
multivariate analysis of changes in community structure. Australian
Journal of Ecology 18: 117-143]. Specifically, with n=4 and 2 treatment
levels, the minimal P-value produced by a permutation-based ANOVA is
expected to be ~0.02857.

This outcome should be independent of the number of species (OTUs) in
the count table provided to anova.cca(). However, I noticed that
anova.cca() sometimes gives P-values far lower than what could be
expected."

The following code reproduces the problem:

A <- gl(2, 4)
set.seed(2)
y <- c(rnorm(4, mean=0), rnorm(4, mean=10))/7
mod <- rda(y ~ A)
# permutation matrix
p <- shuffleSet(8, control=how(complete = TRUE, nperm=40320, maxperm=50000))
anova(mod, permutations = p)
##Permutation test for rda under reduced model
##Permutation: free
##Number of permutations: 40319
##
##Model: rda(formula = y ~ A)
##       Df Variance      F   Pr(>F)    
##Model     1  0.60590 242.77 2.48e-05 ***
##Residual  6  0.01497                    

The reported P-value is the lowest possible value 1/(40319+1) meaning that all permutation F-values are lower than the observed statistic:

> out <- permutest(mod, permutations = p)
> max(out$F.perm) - out$F.0
        Model 
-9.947598e-13 
> sum(out$F.perm >= out$F.0)
[1] 0

We round to 12 decimal places in permutest.cca to avoid false breaking of ties, but obviously we need more rounding -- but how much? We do not want to falsely equalize different values. I think we should also use signif (giving the requested number of significant digits) instead of round (giving the requested number of decimals).

The fix should be simple, but I made this a public issue to raise awareness: permutest.cca is not the only function that is similarly affected.

@jarioksa
Copy link
Contributor Author

Closed with PR #136

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant