why is precision so low? #65

Closed
gregcaporaso opened this issue Jun 19, 2014 · 4 comments

Comments

@gregcaporaso
Member

related to #62?

@gregcaporaso gregcaporaso added this to the paper-submission milestone Jun 19, 2014
@gregcaporaso
Member Author

Likely contributors:

  • all comparisons are at the genus level
  • we haven't filtered singletons

@gregcaporaso
Member Author

After some investigation on this, it looks like there are a few issues:

  1. We're counting an assignment of Other as a false positive. We shouldn't, because Other effectively means the classifier didn't make an assignment at that level. That isn't a precision hit; it should be a recall hit if the group is never identified.
  2. After taking care of (1), most of the remaining false positives are low abundance (relative abundance of 1e-5 or lower), so we need to do some low-abundance OTU filtering.
  3. There are some annotation errors. The Broad mock communities list k__Bacteria;p__Thermi;c__Deinococci;o__Deinococcales, but the RDP classifier identifies k__Bacteria;p__[Thermi];c__Deinococci;o__Deinococcales (note the square brackets around Thermi).
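
The three points above can be sketched roughly as follows. This is only an illustration, not the project's actual evaluation code: the function names, the input shapes (a list of expected taxonomy strings and a dict mapping observed taxonomy strings to relative abundance), and the default cutoff of 1e-5 are all assumptions made for the example.

```python
import re


def normalize(taxon):
    """Strip square brackets so p__[Thermi] compares equal to p__Thermi."""
    return re.sub(r"[\[\]]", "", taxon)


def precision_recall(expected, observed, min_abundance=1e-5):
    """Compute (precision, recall) for a mock community.

    expected: iterable of taxonomy strings known to be in the community.
    observed: dict of taxonomy string -> relative abundance (hypothetical shape).
    min_abundance: assumed cutoff; taxa at or below it are dropped.
    """
    exp = {normalize(t) for t in expected}
    obs = set()
    for taxon, abundance in observed.items():
        # "Other" at the deepest level means no assignment was made there,
        # so it is skipped rather than counted as a false positive.
        if taxon.split(";")[-1].strip() == "Other":
            continue
        # Low-abundance filtering.
        if abundance <= min_abundance:
            continue
        obs.add(normalize(taxon))

    tp = len(exp & obs)
    fp = len(obs - exp)
    fn = len(exp - obs)  # groups never identified count against recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

With this approach, an expected group that only ever shows up as Other lowers recall but leaves precision untouched, which is the behavior described in (1).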

@gregcaporaso
Member Author

Another error detected, this time in the Turnbaugh mock communities. These are listed as containing k__Bacteria;p__Firmicutes;c__Coriobacteriia;o__Coriobacteriales, but uclust assigns k__Bacteria;p__Actinobacteria;c__Coriobacteriia;o__Coriobacteriales.

Remaining false positives are low abundance (all <= 2e-4 relative abundance).
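
A mismatch like this one can be spotted by walking the two lineages rank by rank. The helper below is a hypothetical sketch (the name `first_mismatch` is made up for illustration), assuming semicolon-delimited taxonomy strings as shown above.

```python
def first_mismatch(expected, observed):
    """Return the first (expected_rank, observed_rank) pair that differs.

    Returns None if the lineages agree over the ranks they share.
    """
    for exp_rank, obs_rank in zip(expected.split(";"), observed.split(";")):
        if exp_rank != obs_rank:
            return exp_rank, obs_rank
    return None


listed = "k__Bacteria;p__Firmicutes;c__Coriobacteriia;o__Coriobacteriales"
assigned = "k__Bacteria;p__Actinobacteria;c__Coriobacteriia;o__Coriobacteriales"
print(first_mismatch(listed, assigned))
```

Here the two lineages diverge only at the phylum level, which points to an annotation problem in the expected composition rather than a classifier false positive.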

@gregcaporaso
Member Author

These issues have all been addressed.
