Filtering low abundance Taxa #26
Comments
Thanks for the quick response. The thing is that in some cases I also have ASVs that seem "truly" abundant in one group but absent in the other. For example: ASV/ /P1_1/ P1_2/ P1_3/ P1_4/ P1_5/ ASV/ /P2_1/ P2_2/ P2_3/ P2_4/ P2_5 So I'm worried that setting a smaller value for zero_cut would remove these ASVs from the analysis too. Because of this, I was thinking of applying a filtering step, something like pruning ASVs that have fewer than 80 counts in total, to filter out low-abundance ASVs like ASV_2 or ASV_726 while keeping ASVs like ASV_29. But I don't know whether it would be better to apply this filter before or after the analysis with ancombc, because I'm worried that filtering these elements prior to the analysis would interfere with the normalization process. In the original phyloseq object I have 1303 ASVs, but if I remove ASVs with fewer than 80 counts in total across all samples, I keep 437 ASVs. Thanks in advance
I also wonder whether filtering abundant taxa would interfere with the normalization process, for instance if I also wanted to filter elements with incomplete taxonomies that account for an important part of the counts, in order to repeat the analysis at higher taxonomic levels (genus, family, order, etc.). I'm new to CoDa and the ANCOM-BC paradigm, so any help is much appreciated! Thanks in advance
Thank you for your great suggestion, @sarpiens! Yes, I think it makes a lot of sense to filter ASVs by their total observed abundance. For now this can be done in the data-preprocessing step; for example, QIIME2 has corresponding filtering steps when you generate the feature table (ASV/OTU table) from raw sequencing data (fastq files). We will make that feature available in the ANCOMBC function in the next update. For your second question: yes, theoretically, filtering taxa will not affect the subsequent normalization step. Best,
Thanks a lot!
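The total-count filter discussed above can be sketched in base R. This is only an illustration, not the maintainers' implementation: the rows for ASV_2 and ASV_726 are the ones quoted in the original post, while `ASV_hyp` is a hypothetical row invented here to show a taxon that passes the threshold, and `min_total` is an assumed variable name for the 80-count cutoff.

```r
# Toy count matrix: taxa in rows, samples in columns.
# ASV_2 and ASV_726 come from the issue; ASV_hyp is HYPOTHETICAL,
# added only so that one taxon survives the filter.
counts <- rbind(
  ASV_2   = c(2, 2, 11, 1, 0, 0, 0, 0, 0, 0),
  ASV_726 = c(1, 43, 0, 15, 1, 0, 0, 0, 0, 0),
  ASV_hyp = c(120, 95, 140, 110, 130, 0, 0, 0, 0, 0)
)
colnames(counts) <- c(paste0("P1_", 1:5), paste0("P2_", 1:5))

# Assumed cutoff: keep taxa with at least 80 counts summed over all samples.
min_total <- 80

totals <- rowSums(counts)        # total observed abundance per taxon
keep <- totals >= min_total      # logical vector, one entry per taxon
filtered <- counts[keep, , drop = FALSE]

print(totals)
print(rownames(filtered))

# For a phyloseq object (here called `ps`, an assumed name), the same idea
# could be written with phyloseq's own helpers, roughly:
# ps_filt <- phyloseq::prune_taxa(phyloseq::taxa_sums(ps) >= min_total, ps)
```

Filtering on the total across samples, rather than per sample, keeps taxa that are abundant in one group and absent in the other, which is the case the question is concerned about.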
Hello,
I have a count table of ASVs with two populations, P1 and P2 (5 samples/group), and I ran ancombc without filtering ASVs by abundance:
out = ancombc(phyloseq = pseq_A1_raw, formula = "population",
p_adj_method = "holm", zero_cut = 0.90, lib_cut = 1000,
group = "population", struc_zero = TRUE, neg_lb = TRUE, tol = 1e-5,
max_iter = 100, conserve = TRUE, alpha = 0.05, global = FALSE)
However, I find some ASVs with a significant q_val, but when I look at the corresponding rows in the count table they show very small differences, because they are low-abundance ASVs.
For example:
         P1_1  P1_2  P1_3  P1_4  P1_5  P2_1  P2_2  P2_3  P2_4  P2_5
ASV_2       2     2    11     1     0     0     0     0     0     0
[...]
ASV_726     1    43     0    15     1     0     0     0     0     0
I wonder whether it would be okay to apply a filter to remove low-abundance taxa. And when should I apply it: before or after running the differential abundance analysis with ancombc?
Thanks in advance
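To see why these two ASVs are not removed by `zero_cut = 0.90`, one can compute the proportion of zeros per taxon by hand. The sketch below uses base R and assumes, as the ANCOMBC documentation describes it, that taxa whose proportion of zeros is greater than `zero_cut` are excluded; `summary_tab` is just an illustrative name.

```r
# The two rows quoted in this post: taxa in rows, samples in columns.
counts <- rbind(
  ASV_2   = c(2, 2, 11, 1, 0, 0, 0, 0, 0, 0),
  ASV_726 = c(1, 43, 0, 15, 1, 0, 0, 0, 0, 0)
)
colnames(counts) <- c(paste0("P1_", 1:5), paste0("P2_", 1:5))

zero_cut <- 0.90  # value used in the ancombc call in this post

# Proportion of samples with a zero count, per taxon.
zero_prop <- rowMeans(counts == 0)

# ASSUMPTION: a taxon is excluded when zero_prop > zero_cut.
summary_tab <- data.frame(
  total = rowSums(counts),
  zero_prop = zero_prop,
  dropped_by_zero_cut = zero_prop > zero_cut
)
print(summary_tab)
```

Both ASVs have zeros in 6 of 10 samples, so under this assumption a prevalence cutoff of 0.90 keeps them, and only an abundance-based filter would remove them.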