Replies: 2 comments
-
Hi, that is a good question worth discussing, so I converted it into a discussion. The exact cutoffs are largely a matter of opinion and also depend on the individual analysis. Off the top of my head, I would recommend dropping every bin with 20% contamination or more (at least after deducting "strain heterogeneity"), as such bins are clearly far from optimal. At the very least, such a bin should be double-checked for signs of contamination from very closely related species, which MDMcleaner can miss (some options for that are planned for upcoming MDMcleaner versions). I would definitely not consider working any further with bins showing more than 20% contamination.

Completeness is another question, and far more dependent on the individual research question. In most cases, genomes representing less than 50% of the target genome will not seem worth much. However, if you are looking for a specific pathway, any level of completeness that covers this pathway together with any conserved marker may well still be worth the effort.
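To make the rule of thumb above concrete, here is a minimal Python sketch of how such a triage could look. Everything in it is an assumption for illustration: the `BinQuality` record, the `triage` function, and the labels are hypothetical names and not part of MDMcleaner or CheckM. In particular, "deducting strain heterogeneity" is modelled here by scaling contamination by the fraction not attributed to strain-level variation, which is only one possible reading.

```python
from dataclasses import dataclass


@dataclass
class BinQuality:
    """Hypothetical record of per-bin quality estimates (all values in percent)."""
    name: str
    completeness: float
    contamination: float
    strain_heterogeneity: float = 0.0  # share of contamination attributed to close strains


def adjusted_contamination(q: BinQuality) -> float:
    # Assumption: remove the share of contamination explained by strain heterogeneity.
    return q.contamination * (1.0 - q.strain_heterogeneity / 100.0)


def triage(q: BinQuality,
           max_contamination: float = 20.0,
           min_completeness: float = 50.0,
           target_pathway_present: bool = False) -> str:
    """Return 'drop', 'low priority' or 'keep' following the rule of thumb above."""
    # Bins at or above the contamination cutoff are dropped regardless of completeness.
    if adjusted_contamination(q) >= max_contamination:
        return "drop"
    # Low completeness is only tolerated if the pathway of interest is covered.
    if q.completeness < min_completeness and not target_pathway_present:
        return "low priority"
    return "keep"


# Made-up example values, not real measurements.
bins = [
    BinQuality("bin1", completeness=92.0, contamination=3.5),
    BinQuality("bin2", completeness=70.0, contamination=40.0, strain_heterogeneity=75.0),
    BinQuality("bin3", completeness=35.0, contamination=2.0),
    BinQuality("bin4", completeness=80.0, contamination=25.0),
]
results = {b.name: triage(b) for b in bins}
```

The cutoffs are keyword arguments on purpose, since the right values depend on the research question; setting `target_pathway_present=True` keeps a low-completeness bin that still covers the pathway of interest.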
-
Hi, thanks for the answer.
-
Hi,
Following issue 29, would you suggest a maximum % contamination and/or a minimum % completeness (as determined by CheckM or any other tool) beyond which it would not be worth running MDMcleaner?
For example, I imagine that a MAG with 80% contamination will never get down to 10% after running MDMcleaner, so I would probably skip it.
Best
Greg