DECIPHER classification module compared to dada2 classification #683
Comments
...I used the DECIPHER R chunk in the tutorial to get the first set of bar plots... |
From glancing between them, I would say the difference is big but also narrow: Lots of sequences are being assigned to Muribaculaceae by assignTaxonomy. I'm not sure why that would be, but @digitalwright might have some insight. It is true that in general, DECIPHER will be a bit more conservative about assigning taxonomy than default |
Note that DECIPHER is also assigning some sequences to Muribaculaceae, just a whole lot less than assignTaxonomy does. But I will take it up with @digitalwright as well. |
IDTAXA tends to leave more sequences unclassified at the root level. You can read about that in the section "IDTAXA’s classifications change the interpretation of microbiome data" of their paper. Figure 4 illustrates this behaviour. https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-018-0521-5 |
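One way to check how much of the difference comes from sequences being left unclassified is to compute, for each rank, the fraction of sequences with no assignment under each method. The following is only a minimal sketch in Python, assuming the two taxonomy tables have been exported from R as rows of rank labels with missing values read in as `None`. The rank names follow dada2's default column names; the function name `unassigned_fraction` and both toy tables are hypothetical and are not real output from either classifier.

```python
from typing import Dict, Optional, Sequence

# Rank names as used for the columns of a dada2 taxonomy table.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]

Row = Sequence[Optional[str]]


def unassigned_fraction(table: Sequence[Row],
                        ranks: Sequence[str] = RANKS) -> Dict[str, float]:
    """Return, for each rank, the fraction of rows with no label (None)."""
    n = len(table)
    if n == 0:
        return {rank: 0.0 for rank in ranks}
    return {
        rank: sum(1 for row in table if row[i] is None) / n
        for i, rank in enumerate(ranks)
    }


# Hypothetical toy tables for illustration only (one row per sequence).
assign_taxonomy_toy = [
    ("Bacteria", "Bacteroidetes", "Bacteroidia", "Bacteroidales", "Muribaculaceae", None),
    ("Bacteria", "Bacteroidetes", "Bacteroidia", "Bacteroidales", "Muribaculaceae", None),
    ("Bacteria", "Firmicutes", "Clostridia", "Clostridiales", "Lachnospiraceae", "Roseburia"),
    ("Bacteria", "Firmicutes", "Clostridia", "Clostridiales", "Ruminococcaceae", None),
]
idtaxa_toy = [
    ("Bacteria", "Bacteroidetes", "Bacteroidia", "Bacteroidales", "Muribaculaceae", None),
    ("Bacteria", None, None, None, None, None),
    ("Bacteria", "Firmicutes", "Clostridia", "Clostridiales", "Lachnospiraceae", "Roseburia"),
    (None, None, None, None, None, None),
]

frac_at = unassigned_fraction(assign_taxonomy_toy)
frac_id = unassigned_fraction(idtaxa_toy)

for rank in RANKS:
    print(f"{rank:8s} assignTaxonomy={frac_at[rank]:.2f}  IdTaxa={frac_id[rank]:.2f}")
```

If the unassigned fraction under IdTaxa is already high at the Kingdom or Phylum level, the gap in the bar plots is more likely to reflect conservative classification than a problem with the tutorial chunk.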
A few more points:
Points 1 - 3 above are made in the paper that @apcamargo mentioned. I hope that helps. |
Thanks to all, your comments have been so helpful. |
I'm attaching two versions of the bar plot that is generated at the end of the dada2 tutorial using the tutorial dataset.
I used the DECIPHER R chunk in the tutorial to get the first plot; the second plot was generated using the dada2 R chunks to identify taxa using the Silva 132 reference files.
As you can see, there is a great deal of difference between the two files. Any ideas on how to troubleshoot? It could be a problem with the R chunk in the tutorial, it could be a problem with the .RData file and the way it was created, or it could be a problem with DECIPHER.
If you run this using the latest versions of dada2 and phyloseq (I'm running those, and those are newer than the tutorial), can you replicate the same difference?
If it is not the R chunk in the tutorial, then it is either the .RData file or something inherent in DECIPHER. If that is the case, can I copy you on my correspondence with the DECIPHER people?
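A possible first troubleshooting step is to cross-tabulate the family-level labels that the two methods give to the same sequences, which shows whether the disagreement is spread across many families or concentrated in a few. The sketch below is in Python and assumes each method's family assignments have been exported as a mapping from sequence ID to family, with `None` for unassigned. The helper `family_crosstab`, the ASV IDs, and the labels are all hypothetical illustration data, not results from the tutorial dataset.

```python
from collections import Counter
from typing import Dict, Optional

FamilyMap = Dict[str, Optional[str]]


def family_crosstab(a: FamilyMap, b: FamilyMap) -> Counter:
    """Count (label_in_a, label_in_b) pairs over sequences present in both maps.

    Unassigned sequences (None) are reported as "NA".
    """
    shared = a.keys() & b.keys()
    return Counter((a[s] or "NA", b[s] or "NA") for s in shared)


# Hypothetical toy assignments: sequence ID -> family.
dada2_family = {
    "ASV1": "Muribaculaceae",
    "ASV2": "Muribaculaceae",
    "ASV3": "Muribaculaceae",
    "ASV4": "Lachnospiraceae",
    "ASV5": "Ruminococcaceae",
}
idtaxa_family = {
    "ASV1": "Muribaculaceae",
    "ASV2": None,
    "ASV3": None,
    "ASV4": "Lachnospiraceae",
    "ASV5": "Ruminococcaceae",
}

table = family_crosstab(dada2_family, idtaxa_family)

# Keep only the pairs where the two methods give different labels.
disagreements = {pair: n for pair, n in table.items() if pair[0] != pair[1]}

for (fam_a, fam_b), n in sorted(disagreements.items(), key=lambda kv: -kv[1]):
    print(f"assignTaxonomy={fam_a}  IdTaxa={fam_b}  n={n}")
```

A table dominated by a single pair, such as one family under one method and unassigned under the other, would point toward a difference in classifier behaviour rather than a broken chunk or a corrupted .RData file.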