No taxa observed in all samples? error #14

Closed
nmshahir opened this issue Aug 13, 2018 · 6 comments

Comments

@nmshahir

Hello Dr. Willis,

I was fiddling around with DivNet and my phyloseq dataset and attempted to run it at the ASV level as well as the genus level. In both cases I obtained the following error:

Error in pick_base(W) : Yikes! No taxa observed in all samples!
Pick which taxon is to be the base

First, how would one go about picking a taxon to be the base and second, why do I need one?

Best,
Nur Shahir

@adw96
Owner

adw96 commented Aug 13, 2018

Hi Nur! Thanks for your question. The model that DivNet fits requires a "denominator" taxon, against which other taxon abundances are to be compared. The effect of the choice of taxon on your estimates is tiny -- here's how robust it is to your choice:
[Image: screen shot 2018-08-13 at 10 28 37 am]
I would recommend choosing a medium-abundance taxon (to make it not arbitrary, perhaps the taxon with median abundance across all samples -- i.e. rank your taxa by their abundance and choose the middle one), but try a couple of different ones to confirm that your estimates don't change much. I doubt they will :)
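One way to read the "rank your taxa and choose the middle one" suggestion is sketched below in base R. It assumes a count matrix with samples in rows and taxa in columns; the matrix `W` and its taxon names are made-up toy values, not data from this thread.

```r
# Toy count matrix: rows are samples, columns are taxa (all values are made up).
W <- matrix(
  c(10, 0, 5, 120, 3,
    12, 4, 0,  98, 1,
     8, 2, 7, 110, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:5))
)

# Total abundance of each taxon across all samples.
total_abundance <- colSums(W)

# Rank taxa from least to most abundant and take the middle one.
ranked <- names(sort(total_abundance))
median_taxon <- ranked[ceiling(length(ranked) / 2)]
median_taxon
```

The resulting name could then be tried as the base taxon, alongside a couple of alternatives, to check that the estimates are stable.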

I hope that helps!

Amy

@nmshahir
Author

It does! I've tested it out on my data and the estimates don't change by much, but the error does, depending on the base I choose. This is slightly problematic when doing statistics, since the p-values depend on the error bars.
The data I have comes from patients, and not all taxa at the RSV level (or even the genus level) are seen across all of my samples. Should I pick my base from the taxa that are observed in at least 50% or 75% or some X% of the patients?
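To make the "observed in X% of samples" idea concrete, here is a minimal base R sketch that computes the prevalence of each taxon and lists the taxa meeting a cutoff. The matrix `W`, the taxon names, and the 75% threshold are illustrative assumptions; how `pick_base` performs its own check is not shown in this thread.

```r
# Toy count matrix: rows are samples, columns are taxa (all values are made up).
W <- matrix(
  c(10, 0, 5, 120, 3,
    12, 4, 0,   0, 1,
     8, 2, 7, 110, 0,
     0, 1, 9, 130, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), paste0("taxon", 1:5))
)

# Prevalence: fraction of samples in which each taxon has a nonzero count.
prevalence <- colMeans(W > 0)

# Taxa observed in every sample (empty here, as in the reported situation).
observed_in_all <- names(prevalence)[prevalence == 1]

# Candidate base taxa observed in at least 75% of samples (arbitrary cutoff).
threshold <- 0.75
candidates <- names(prevalence)[prevalence >= threshold]
candidates
```

In this toy example no taxon is present in every sample, so a base would have to be chosen by hand from the candidates.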

Thank you,
Nur

@adw96
Owner

adw96 commented Aug 20, 2018

I would recommend using a base taxon that reduces the amount of variability (i.e. the most variance-stabilising base taxon). In general I would expect a taxon observed in more samples to be the most stable, but this isn't always the case. I'm hesitant to set a hard cutoff for what percentage of samples it should be observed in, so instead I recommend trying a few different taxa and using one that gives results consistent with those from many other base taxa.
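The "try a few different taxa" workflow could be organised roughly as below. This is only a sketch: `compare_bases` and `toy_fit` are hypothetical names, and `toy_fit` is a stand-in that returns made-up numbers so the example runs on its own. It is not a DivNet function and says nothing about the structure of DivNet's output.

```r
# Hypothetical helper: run a user-supplied fitting function once per candidate
# base taxon and collect the estimate and standard error it reports.
compare_bases <- function(W, candidates, fit_fn) {
  rows <- lapply(candidates, function(b) {
    fit <- fit_fn(W, base = b)
    data.frame(base = b, estimate = fit$estimate, se = fit$se,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for a real model fit (made-up numbers, NOT a DivNet function).
toy_fit <- function(W, base) {
  list(estimate = 1.5, se = 1 / sqrt(sum(W[, base] > 0)))
}

# Toy count matrix: rows are samples, columns are taxa (all values are made up).
W <- matrix(
  c(10, 0, 5, 120, 3,
    12, 4, 0,  98, 1,
     8, 2, 7, 110, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:5))
)

results <- compare_bases(W, c("taxon1", "taxon4", "taxon5"), toy_fit)
results[order(results$se), ]
```

In real use, `fit_fn` would be replaced by a wrapper around the actual model fit that extracts the estimate and standard error of interest.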

I'm going to update the documentation for base to provide this guidance and then close this issue. Please reopen if you find a bug in the code :)

@adw96 adw96 closed this as completed in d04b975 Aug 20, 2018
@jvhagey

jvhagey commented Aug 22, 2018

Just for clarification, Amy, can you add to the documentation for divnet what the taxon index is? Is it the row number at which the ASV is found in the otu_table of the phyloseq object, or is the index something else? I am working to sort out the same problem. Thanks!

@adw96
Owner

adw96 commented Aug 23, 2018

Thanks for your comments, everyone! I definitely appreciate that this wasn't clear. Originally this was the matrix index of the taxon -- no idea why I thought this was clear. To make this easier, I've also added the functionality to give the name of a taxon -- this prevents ambiguities from matrix ordering. So, for example, you can now run:

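library(DivNet)    # provides divnet() and the Lee dataset
library(phyloseq)  # provides tax_glom()
library(magrittr)  # provides the %>% pipe
data(Lee)
lp <- Lee %>% tax_glom("Phylum")  # aggregate taxa to the phylum level
divnet(lp, base = "ASV_91")       # specify the base taxon by name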

The documentation for base now reads

base The column index of the base taxon in the columns of W, or the name of the taxon (must be a column name of W, or a taxon name if W is a phyloseq object). If NULL, will use pick_base to choose a taxon. If no taxa are observed in all samples, an error will be thrown. In that case, we recommend trying a number of different highly abundant taxa to confirm the results are robust to the taxon choice.
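The difference between giving `base` as a column index and as a taxon name can be illustrated with a small base R sketch. The function `resolve_base` is a hypothetical helper written for this example, not DivNet's implementation, and the matrix `W` holds made-up values.

```r
# Hypothetical helper: turn `base` (a taxon name or a column index) into a
# column index of W. This is an illustration, not DivNet's own code.
resolve_base <- function(W, base) {
  if (is.character(base)) {
    idx <- match(base, colnames(W))
    if (is.na(idx)) stop("Taxon '", base, "' is not a column name of W")
    return(idx)
  }
  if (base < 1 || base > ncol(W)) stop("Column index out of range")
  as.integer(base)
}

# Toy matrix: rows are samples, columns are taxa (all values are made up).
W <- matrix(1:6, nrow = 2,
            dimnames = list(c("s1", "s2"), c("ASV_7", "ASV_91", "ASV_3")))

resolve_base(W, "ASV_91")  # lookup by name
resolve_base(W, 2)         # lookup by column position
```

Both calls refer to the same column here, but only the name stays valid if the columns are reordered, which is the ambiguity that passing a name avoids.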

Thanks everyone for your input! I'm very happy to update the documentation to help, so thanks for your feedback!

Reopening until I've merged the pull request into the master branch TODO(Amy)

@adw96 adw96 reopened this Aug 23, 2018
@adw96
Owner

adw96 commented Aug 23, 2018

closed with dbb4e77
