Cannot allocate vector of size ... #3

Closed
g-antonello opened this issue May 17, 2022 · 5 comments

Comments

g-antonello commented May 17, 2022

Dear Huijuan,
I like this tool for many reasons, above all because it is fast compared with alternatives such as DESeq2 and corncob. However, I am running into a problem with a differential abundance analysis of 172 samples and 781 SGBs from MetaPhlAn (each sample sums to 100):

linda(otu.tab = abundances(phyloseq), meta = meta(phyloseq), formula = "~ age + sex + using_drugs + trait_of_interest", type = "proportion", adaptive = TRUE)
This outputs: Error: cannot allocate vector of size 9.6 Gb.

Is this really that memory-intensive? Or am I doing something wrong?

The variable types are:
* age: integer
* sex: binary
* using_drugs: binary
* trait_of_interest: integer taking values 1, 2 or 3
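
One quick way to sanity-check a report like this is to compare the size of the failed allocation with the size of the data, and to look at the design matrix the formula produces. The sketch below uses base R only and is not part of LinDA. The metadata values are made up, and it assumes R reports "Gb" in units of 1024^3 bytes with 8 bytes per numeric element. It shows one way a mis-encoded covariate can inflate a model; whether that is what happened here is not established by the thread.

```r
# Size implied by the error message (assumption: Gb = 1024^3 bytes,
# 8 bytes per double-precision element).
n_doubles <- 9.6 * 1024^3 / 8
n_doubles        # on the order of 1.3e9 elements
172 * 781        # elements in the abundance table itself

# Hypothetical metadata mimicking the four covariates described above.
set.seed(1)
n <- 172
meta_df <- data.frame(
  age               = sample(18:80, n, replace = TRUE),
  sex               = factor(sample(c("F", "M"), n, replace = TRUE)),
  using_drugs       = factor(sample(c("no", "yes"), n, replace = TRUE)),
  trait_of_interest = sample(1:3, n, replace = TRUE)
)

f <- ~ age + sex + using_drugs + trait_of_interest

# With the intended encodings the design matrix is small:
# an intercept plus one column per covariate.
X_ok <- model.matrix(f, data = meta_df)
dim(X_ok)

# If age is accidentally stored as character (or factor), every distinct
# value becomes its own dummy column.
meta_bad <- transform(meta_df, age = as.character(age))
X_bad <- model.matrix(f, data = meta_bad)
dim(X_bad)

# Inspecting column classes before fitting catches this early.
str(meta_df)
```

The abundance table is several orders of magnitude smaller than the failed allocation, so the input data alone do not account for it; checking `str()` of the metadata and `dim()` of the design matrix is a cheap first step.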

zhouhj1994 (Owner) commented May 18, 2022 via email

g-antonello (Author) commented May 19, 2022

Dear Huijuan,
Thank you for your detailed reply. I suspect it was a variable-encoding issue, though I am not sure: when I started from scratch with cleaner code, it worked. Thank you for this!
Another point: MetaPhlAn returns relative abundances that sum to 100 per sample, so these are proportions rather than counts. I accounted for this by rescaling each sample to sum to 1 and then using type = "proportion" in linda.

  1. Is LinDA's approach still valid with this setup, even if it can't leverage sequencing depth?
  2. Do you think it is a valid approach to include the number of reads mapped per sample as a covariate, to mimic what LinDA would do internally with counts?

Best,
Giacomo
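
The preprocessing described in this comment can be sketched in base R as follows. The abundance matrix, sample names, and the `n_reads_mapped` column are hypothetical stand-ins, and taxa are assumed to be in rows with samples in columns. The sketch only shows the mechanics of rescaling and of adding a per-sample covariate to the formula string; it does not settle whether that adjustment is statistically appropriate.

```r
# Hypothetical MetaPhlAn-style table: taxa in rows, samples in columns,
# each sample (column) summing to 100.
set.seed(42)
n_taxa <- 5
n_samples <- 4
rel_ab <- matrix(
  rexp(n_taxa * n_samples),
  nrow = n_taxa,
  dimnames = list(paste0("SGB", 1:n_taxa), paste0("S", 1:n_samples))
)
rel_ab <- sweep(rel_ab, 2, colSums(rel_ab), "/") * 100

# Rescale each sample so that its proportions sum to 1.
prop <- sweep(rel_ab, 2, colSums(rel_ab), "/")
colSums(prop)

# Hypothetical metadata; n_reads_mapped is an assumed column name.
meta_df <- data.frame(
  trait_of_interest = c(1, 2, 3, 2),
  n_reads_mapped    = c(2.1e6, 3.4e6, 1.8e6, 2.9e6),
  row.names         = colnames(prop)
)

# Formula string in the same style as the call quoted earlier in the thread,
# with the mapped-read count added as an extra (log-scaled) covariate.
f <- "~ trait_of_interest + log(n_reads_mapped)"

# Check that the formula parses against the metadata before fitting.
X <- model.matrix(as.formula(f), data = meta_df)
dim(X)

# The fit itself would then follow the call pattern from the thread
# (left commented out because it needs the LinDA package):
# linda(otu.tab = prop, meta = meta_df, formula = f, type = "proportion")
```

Taking the log of the read count is a choice made for this sketch, not something stated in the thread.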

zhouhj1994 (Owner) commented May 20, 2022 via email

g-antonello (Author) commented

Thank you for the insight! Does the model still perform well (accuracy/power/...) without imputing zeros with the default method used for count data?

Giacomo

zhouhj1994 (Owner) commented May 21, 2022 via email
