Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

need larger value of 'ncol' as pivoting occurred #452

Open
ZhangDengwei opened this issue Dec 4, 2021 · 10 comments
Open

need larger value of 'ncol' as pivoting occurred #452

ZhangDengwei opened this issue Dec 4, 2021 · 10 comments
Labels

Comments

@ZhangDengwei
Copy link

Hi,

I used adonis2 with the following code

adonis2(df_pathway_f~., bgc_exp_f, permutations = 999, distance = 'bray', by = "margin")

But I got the error
need larger value of 'ncol' as pivoting occurred

I understand this is due to too large size of bgc_exp_f with ~160 columns, could you have any clues on how to fix it. Thanks in advance!

@jarioksa
Copy link
Contributor

jarioksa commented Dec 7, 2021

I cannot guess: the error message is not from vegan, but from an external function that vegan calls (and I don't know which function). Please issue traceback() after getting the error to see where this situation occurs.

@gavinsimpson
Copy link
Contributor

Looks like the error comes from qr.X() https://rdrr.io/github/robertzk/monadicbase/src/R/qr.R

Note sure that helps us much as that gets called in a few placed that would be hit by the call to adonis2()

@jarioksa
Copy link
Contributor

jarioksa commented Mar 1, 2022

It indeed comes from qr.X function that we do not call from adonis2 nor from anova.ccabymargin (it is called from anova.ccabyaxis, but that was not called here). Without proper debugging information this looks too tedious to track. We would need a reproducible example: something we can run and get the error. If that cannot be provided, at least a traceback() to tell us how to reach this error. If we are unable to get the error or even see where it comes from, nothing can done.

I expand a bit: we do not call qr.X() from vegan, but obviously some function that we call from vegan calls an external function that calls a function that ... calls a function that calls qr.X. The best we can do is to find the point where the control moves from vegan to those other functions, and to see if there is something we can do before getting to that event horizon. However, we need a reproducible example.

@Rob-murphys
Copy link

Not meaning to necro this but it only seems to occur when using ´by="margin"` with interactions specified in the formula. For example:

# This will give the error
> adonis2(bray_dismat.norm.no_na ~ caste_phase * species * genus, by = "margin", permutations = 999, data = comb_metadata.sorted.no_na)
Error in qr.X(object$CCA$QR) : 
  need larger value of 'ncol' as pivoting occurred

# This will not give the error
> adonis2(bray_dismat.norm.no_na ~ caste_phase + species + genus, by = "margin", permutations = 999, data = comb_metadata.sorted.no_na)
Permutation test for adonis under reduced model
Marginal effects of terms
Permutation: free
Number of permutations: 999

adonis2(formula = bray_dismat.norm.no_na ~ caste_phase + species + genus, data = comb_metadata.sorted.no_na, permutations = 999, by = "margin")
             Df SumOfSqs      R2      F Pr(>F)    
caste_phase   3    4.402 0.05502 3.8621  0.001 ***
species       4    6.161 0.07701 4.0540  0.001 ***
genus         0    0.000 0.00000   -Inf           
Residual    156   59.271 0.74081                  
Total       167   80.009 1.00000                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

@jarioksa
Copy link
Contributor

No reproducible example. Nothing can be done.

@Rob-murphys
Copy link

Apologies I was not trying to seek help just state my experience. I can give data for a reproducible example tomorrow but for what it is worth it seems to occur for any large dataset.

@jarioksa
Copy link
Contributor

One easy thing that often helps to call traceback() after the error. This prints out the call stack which shows where this error occurred. As I wrote in my previous messages, adonis2 (and anova.cca by="margin") do not call qr.X and therefore I do not know in which function this error crops out. This can sometimes help even without a reproducible example because then we know where to search.

@jarioksa
Copy link
Contributor

jarioksa commented Apr 28, 2023

Now I know where this happens and how we get there. I replaced base::qr.X with a bogus function and got the call stack after error:

> qr.X # bogus function to replace base::qr.X
function(...) stop("hit qr.X")
<environment: namespace:base>
> adonis2(dune ~ ., dune.env, by="margin") # by = "margin" hits bogus qr.X which gives an error
Error in qr.X(object$CCA$QR) : hit qr.X
> traceback()
8: stop("hit qr.X") at #1
7: qr.X(object$CCA$QR)
6: model.matrix.rda(object)
5: model.matrix(object)
4: anova.ccabymargin(object, permutations = permutations, model = model, 
       parallel = parallel, scope = scope)
3: anova.cca(sol, permutations = perm, by = by, parallel = parallel)
2: anova(sol, permutations = perm, by = by, parallel = parallel)
1: adonis2(dune ~ ., dune.env, by = "margin")

Standard disclaimer: Don't do this at home! – you do not want to replace base::qr.X with a bogus function.

Clearly the calling sequence of vegan functions is adonis2(..., by="margin")anova.ccabymarginmodel.matrix.rda. Therefore the problem should be fixed in model.matrix.rda (and probably also model.matrix.cca has the same problem). I have no idea which conditions generate this error. I tried a few cases, but all where successful in the sense that no error occurred.

The same problem should be hit in anova(..., by = "margin") of rda (and probably cca) results.

We need a test case that produces this error.

@jarioksa
Copy link
Contributor

jarioksa commented Apr 28, 2023

Here a reproducible example:

> adonis2(dune ~ Moisture * Use * A1, dune.env, by = "margin")
Error in qr.X(object$CCA$QR) : 
  need larger value of 'ncol' as pivoting occurred
> m <- rda(dune ~ Moisture * Use * A1, dune.env)
> anova(m, by="margin")  # anova.ccabymargin calls model.matrix
Error in qr.X(object$CCA$QR) : 
  need larger value of 'ncol' as pivoting occurred
> anova(m, by="axis")   # anova.ccabyaxis calls model.matrix
Error in qr.X(object$CCA$QR) : 
  need larger value of 'ncol' as pivoting occurred

Here the core of the problem:

> m
Call: rda(formula = dune ~ Moisture * Use * A1, data = dune.env)

               Inertia Proportion Rank
Total         84.12368    1.00000     
Constrained   78.15147    0.92901   17
Unconstrained  5.97222    0.07099    2
Inertia is variance 
Some constraints or conditions were aliased because they were redundant
## -- clip -- ##
> alias(m, names=TRUE)
[1] "Moisture.Q:Use.Q"    "Moisture.C:Use.Q"    "Moisture.C:A1"      
[4] "Moisture.C:Use.L:A1" "Moisture.Q:Use.Q:A1" "Moisture.C:Use.Q:A1"

Explanation: when you have higher-order interactions, some of these interaction terms become redundant and explain nothing. In this case, the rank of constrained component was 17 leaving 2 degrees of freedom for the residual component. Six interaction terms were aliased, meaning that the full design matrix had 17 + 6 = 23 columns. The data have 20 rows (observations), and with its default settings qr.X can only construct a design matrix of 20 columns, and needs more columns for pivoting (or handling the pivoted redundant variables).

There is an argument (ncol = ) in qr.X which can be set to higher values, but I haven't checked how it fills the aliased columns and will that work with vegan:::anova.cca. There are also other alternatives that we may inspect: Do we need model.matrix in anova.ccabymargin & anova.ccabyaxis? Can we detect these cases and handle these smoothly?

This error will occur in special cases: the number of non-aliased terms must be < n-1 so that there is residual variation left and anova will be run, but the number of aliased terms must be so high that together with non-aliased terms the model matrix has > n columns. If there are ≥ n-1 non-aliased terms, anova.cca will catch this case and use anova.ccanull:

> adonis2(dune ~ matrix(runif(20*19), 20, 19), dune.env, by = "margin")
No residual component

adonis2(formula = dune ~ matrix(runif(20 * 19), 20, 19), data = dune.env, by = "margin")
         Df SumOfSqs R2 F Pr(>F)
Model    19    4.299  1         
Residual  0    0.000  0         
Total    19    4.299  1      

I think it makes no sense to use by = "margin" if you have high-order interactions, because the only marginal term will be the highest order interaction of all predictors in the model. You get that term as the last ("marginal") also with by = "term" with all other terms sequentially as a bonus.

@jarioksa jarioksa added the bug label Apr 28, 2023
jarioksa added a commit that referenced this issue May 6, 2023
model.matrix fails if the full model matrix has more columns than
there are observations, but number of non-aliased parameters was
lower than the number observations. This typically happens with
models with high-order interactions. This was reported for marginal
models in adonis2 (issue #452) but also affects constrained
ordination methods (rda, cca, dbrda) in anova with by="margin" or
by="axis".
@jarioksa
Copy link
Contributor

jarioksa commented May 6, 2023

This is fixed in commit a64ed4e. Please check this and report on success.

The error occurred with two conditions:

  • The rank of predictor variables was < n, and
  • There were in addition so many aliased (redundant) terms that the total number was > n.

Most commonly this happened in models with high-order interactions including factor variables which typically have several redundant combinations of factor levels.

The bug affected adonis2 with by="margin" and constrained ordination methods (rda, cca, dbrda) in anova with by="margin" or by="axis".

Although the bug is now avoided, you usually do not want to have permutations tests by = "margin" with high-order interactions, because the marginal term typically is one of those redundant levels, and no finite P-value can be reported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants