Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

anova.ccabymargin fails for on a dbrda fit with an interaction term and a Condition #463

Closed
gavinsimpson opened this issue Jan 26, 2022 · 5 comments
Assignees
Labels

Comments

@gavinsimpson
Copy link
Contributor

gavinsimpson commented Jan 26, 2022

Consider

library("vegan")
set.seed(1)
nspp <- 10
vars <- expand.grid(f1 = factor(c("Yes", "No")), f2 = factor(c("Yes", "No")), p1 = factor(paste0("p", 1:5)))
vars <- rbind(vars, vars, vars)
X <- model.matrix(~ f1 * f2 * p1, data = vars)
betas<- matrix(rnorm(ncol(X) * nspp), ncol = nspp)
y <- X %*% betas + rnorm(ncol(X) * 3 * nspp)
names(y) <- paste0("spp", seq_len(nspp))

df <- data.frame(cbind(y, vars))

ord <- dbrda(y ~ f1 * f2 + Condition(p1), data = df)
anova(ord, by = "margin")

This produces

r$> anova(ord, by = "margin")
Permutation test for dbrda under reduced model
Marginal effects of terms
Permutation: free
Number of permutations: 999

Model: dbrda(formula = y ~ f1 * f2 + Condition(p1), data = df)
         Df Variance    F Pr(>F)
f1:f2     0     0.00 -Inf       
Residual 52    12.93

which is clearly incorrect given

r$> ord
Call: dbrda(formula = y ~ f1 * f2 + Condition(p1), data = df)

              Inertia Proportion Rank
Total         27.4016     1.0000     
Conditional    9.1711     0.3347    4
Constrained    5.3007     0.1934    3
Unconstrained 12.9298     0.4719   10
Inertia is mean squared Euclidean distance 

Eigenvalues for constrained axes:
dbRDA1 dbRDA2 dbRDA3 
 3.929  1.041  0.331 

Eigenvalues for unconstrained axes:
  MDS1   MDS2   MDS3   MDS4   MDS5   MDS6   MDS7   MDS8   MDS9  MDS10 
2.7741 2.2651 1.6486 1.4417 1.2276 0.9281 0.8537 0.7784 0.5823 0.4301

The problem comes in anova.ccabymargin because in

## analyse only terms of 'ass' thar are in scope
scopeterms <- which(attr(terms(object$terminfo), "term.labels") %in% trms)

the Condition(p1) term comes before the interaction term in the terminfo / term.labels

r$> attr(terms(ord$terminfo), "term.labels")                                  
[1] "f1"    "f2"    "p1"    "f1:f2"

r$> trms <- drop.scope(ord)                                                   

r$> which(attr(terms(ord$terminfo), "term.labels")  %in% trms)                
[1] 4

and then in

mods <- lapply(scopeterms, function(i, ...)
permutest(ordConstrained(Y, X[, ass != i, drop=FALSE], Z, "pass"),
permutations, ...), ...)

we're subsetting the X model matrix to drop the interaction term but that, of course, doesn't have 4 columns. This doesn't fail though because of the ass != i bit with ass defined as

ass <- object$terminfo$assign

which in this case is given by

r$> ord$terminfo$assign                                                       
[1] 1 2 3

In that lapply() we're subsetting X with

Browse[2]> head(X[, ass != 4, drop = FALSE]) # 4 is from the which(attr(terms(object$terminfo .... )
   f1Yes f2Yes f1Yes:f2Yes
X1   0.5   0.5        0.75
X2  -0.5   0.5       -0.25
X3   0.5  -0.5       -0.25
X4  -0.5  -0.5       -0.25
X5   0.5   0.5        0.75
X6  -0.5   0.5       -0.25

with X being

Browse[2]> head(X)                                                            
   f1Yes f2Yes f1Yes:f2Yes
X1   0.5   0.5        0.75
X2  -0.5   0.5       -0.25
X3   0.5  -0.5       -0.25
X4  -0.5  -0.5       -0.25
X5   0.5   0.5        0.75
X6  -0.5   0.5       -0.25

So we're not dropping the correct column; we're not dropping anything at all here which explains the 0 for df and Variance columns in the output from anova.ccabymargin shown above.

I'm not sure how to fix this without potentially causing other problems were I to mess with the formula parsing or custom terminfo stuff.

Looking at cca.formula I think the same problem will affect it too, but I haven't tested that yet.

@jarioksa
Copy link
Contributor

Confirmed. This is a shorter example using vegan data:

library(vegan)
data(dune, dune.env)
m <- cca(dune ~ A1 * Management + Condition(Moisture), data=dune.env)
anova(m, by = "margin") # wrong answer

This concerns all constrained ordination methods with interaction term in constraints when there is a Condition term, and it seems to occur only in anova(..., by="margin").

jarioksa added a commit that referenced this issue Jan 27, 2022
@jarioksa
Copy link
Contributor

@gavinsimpson commit ac87d1b seems to fix this bug. I made some search in the vegan codebase and did not spot any other pieces where this bug crops out.

@gavinsimpson
Copy link
Contributor Author

@jarioksa That ( ac87d1b plus the tweaks) looks good. We can close this once that's merged. Are you going to backport this to the current cran branch for a release or is there enough change built up on the master branch to warrant a point release to CRAN?

@jarioksa
Copy link
Contributor

jarioksa commented Jan 27, 2022

@gavinsimpson : vegan has been moving pretty slowly recently, and 2.5-* has diverged so much that I want to base the next release (2.6-1) on the current master. There are some issues that need solving, plus a lot of QA & testing, and I'm busy with another package (Hmsc) currently, but I'll try to do my best. I'll write a personal message to you about some issues that need resolution.

jarioksa added a commit that referenced this issue Jan 29, 2022
@jarioksa
Copy link
Contributor

closed with commit 47e9fc3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants