Trying to filter phyloseq object by prevalence and relative abundance simultaneously #1555

mirpie · 2022-02-21T18:39:36Z

Hello!

I'm trying to implement a custom taxa filtering method that retains OTUs meeting the following criteria:
either prevalence of read count > 10 in more than 50% of samples, OR mean relative abundance of >0.1% and prevalence of read count > 10 in more than 10% of samples.

Because these criteria rely on both relative abundance and absolute count data, I've set up the code for filter_taxa() as follows:

dat_pr_high = filter_taxa(dat_pr_OBIT, function(x){(sum(x > 10) > nsamples*0.5) | ((sum(x > 10) > (nsamples*0.1)) & (mean(x/sum(x)) > 0.001))}, prune = T)

However, this doesn't seem to work, as the phyloseq object I get back contains taxa with low prevalence (only present in 35 samples) and a mean relative abundance < 0.001 (0.0003).
I know I can transform the phyloseq object to relative abundance using transform_sample_counts(), but I don't want to do this because I need to retain the raw counts for downstream analysis. Furthermore, I need to glom taxa later on, which produces a mismatch in the ASV labels between the relative and absolute OTU tables (due to the selection of different archetypes in tax_glom()).

If someone could point out my mistake I would greatly appreciate it!

mirpie · 2022-02-21T19:31:20Z

Solved! I realized I was normalizing by the total count of each taxon, not of each sample. Simply sub in mean(x/sample_sums(dat_pr_OBIT)) for mean(x/sum(x)) in the above code :)
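For anyone hitting the same thing, here is a minimal base R sketch of why the original condition never filtered anything. The toy matrix, taxon names, and variable names (counts, lib_sizes, keep) are made up for illustration and are not from the original data. Because x/sum(x) always sums to 1, mean(x/sum(x)) equals 1/(number of samples) for every taxon with a nonzero total, so it exceeds 0.001 whenever there are fewer than 1000 samples. Dividing by the per-sample totals gives the intended mean relative abundance.

```r
# Hypothetical toy count matrix: taxa as rows, samples as columns.
counts <- matrix(
  c(50000, 40000, 60000, 45000,   # taxon_A: abundant, present everywhere
      200,     0,   150,     0,   # taxon_B: moderate, present in half the samples
        0,    12,     0,     0),  # taxon_C: rare, present in one sample
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_A", "taxon_B", "taxon_C"), paste0("S", 1:4))
)

n_samp    <- ncol(counts)
lib_sizes <- colSums(counts)  # per-sample totals, the analogue of sample_sums()

# Normalizing by the taxon's own total: always 1 / n_samp, so uninformative.
per_taxon_mean <- apply(counts, 1, function(x) mean(x / sum(x)))

# Normalizing by per-sample totals: the actual mean relative abundance.
per_sample_mean <- apply(counts, 1, function(x) mean(x / lib_sizes))

# Same criteria as in the question, using the per-sample normalization.
keep <- apply(counts, 1, function(x) {
  prev <- sum(x > 10)  # number of samples with more than 10 reads
  (prev > n_samp * 0.5) |
    ((prev > n_samp * 0.1) & (mean(x / lib_sizes) > 0.001))
})

filtered <- counts[keep, , drop = FALSE]  # raw counts are retained
print(per_taxon_mean)
print(per_sample_mean)
print(rownames(filtered))
```

In this toy example taxon_C passes the 10% prevalence check but has a mean relative abundance well below 0.001, so it is dropped only when the per-sample totals are used. With a phyloseq object the same idea is the substitution described above: divide by sample_sums(dat_pr_OBIT) inside the function passed to filter_taxa(), which leaves the raw counts in the returned object untouched.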
