Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

phylolm() obscure errors when a predictor has unused levels #72

Open
AMBarbosa opened this issue Jun 26, 2024 · 2 comments
Open

phylolm() obscure errors when a predictor has unused levels #72

AMBarbosa opened this issue Jun 26, 2024 · 2 comments

Comments

@AMBarbosa
Copy link
Contributor

AMBarbosa commented Jun 26, 2024

phylolm() fails with obscure error messages (very hard to debug) when a predictor has no variation in the provided (sub)dataset, or even when a predictor is a factor with fewer than all possible levels. Here's some reproducible code based on help file examples:

# create example data:

set.seed(16)
tre = rcoal(60)
taxa = sort(tre$tip.label)
b0=0; b1=1
(x <- rTrait(n=1, phy=tre, model="BM",
            parameters=list(ancestral.state=0, sigma2=10)))
(xd <- rTraitDisc(phy=tre, model="ER", k = 3))
unique(xd)  # 2 levels out of 3
y <- b0 + b1*x + rTrait(n=1, phy=tre, model="lambda",
                        parameters=list(
    ancestral.state=0, sigma2=1, lambda=0.5))
dat = data.frame(trait=y[taxa], x=x[taxa], xd=xd[taxa])
str(dat)


# model:

fit = phylolm(trait ~ x + xd, data=dat, phy=tre, model="lambda")

# Error in solve.default(comp$XX) :
# Lapack routine dgesv: system is exactly singular: U[3,3] = 0

# or some other times:
# Error in solve.default(comp$XX) : 
# system is computationally singular: reciprocal condition number = 2.12258e-17
# [or another number]

Below are some possible solutions that I've been using in my scripts:

# drop unused factor levels:

dat <- droplevels(dat)
str(dat)
fit1 = phylolm(trait ~ x + xd, data=dat, phy=tre, model="lambda")


# character instead of factor for discrete predictors:

dat$xd <- as.character(dat$xd)
str(dat)
fit2 = phylolm(trait ~ x + xd, data=dat, phy=tre, model="lambda")


all.equal(fit1, fit2)

Incorporating these (or equivalent) solutions in the function, or at least a more informative error message, could spare the users from countless hours of debugging.

Weirdly, I found some cases where I still get a phylostep() error after using droplevels(), but not after using as.character() for factors instead -- let me know if you want me to send you my data for reproducing this.

Regards,

@lamho86
Copy link
Owner

lamho86 commented Jul 17, 2024

Hello @AMBarbosa,

Thanks for the notice. I prefer the drop level solution but you indicated that there is a problem. Could you please send me your data and a reproducible code to check?

@AMBarbosa
Copy link
Contributor Author

Will do, if I eventually find it again... In most cases, droplevels() fixes it, together with the exclusion of non-varying predictors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants