Is it suitable for refining bins? #2

Closed
Microbion opened this issue Jun 5, 2023 · 6 comments

Comments

@Microbion

I read the paper. In the discussion you mentioned that "COBRA can also be applied to microbial genomes and whole metagenomes" and "using COBRA only to extend the subset of longer contigs". How about using COBRA as a bin refiner? Although some refiners have been reported, such as https://github.com/dparks1134/RefineM and https://apcamargo.github.io/magpurify2, purifying bins is difficult in extremely complex microbial communities. Perhaps COBRA could perform better as a bin refiner.

@linxingchen
Owner

Hi, thank you.

You are right, COBRA is able to do that, and it did well, as you can see from Supplementary Figure 18 in the manuscript. The only problem arises when the original binning has assigned, for example, two contigs from the same genome to two different bins: if COBRA can join these two contigs, it will have a problem assigning the joined contig to a bin. I'd love to discuss this further.
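
To make the assignment problem concrete, here is a minimal sketch of how one might flag joined sequences whose member contigs came from different bins. It is not part of COBRA; the names `contig_to_bin`, `joins`, and `find_conflicting_joins` are hypothetical, and the input is assumed to be a simple mapping of each joined sequence to the contigs it was built from.

```python
# Hypothetical input: contig -> bin assignment from an upstream binner.
contig_to_bin = {
    "contig_1": "bin_A",
    "contig_2": "bin_B",
    "contig_3": "bin_A",
}

# Hypothetical input: joined sequence -> the contigs it was built from.
# "contig_9" is deliberately unbinned (e.g. a very small contig).
joins = {
    "join_1": ["contig_1", "contig_2"],
    "join_2": ["contig_3", "contig_9"],
}


def find_conflicting_joins(joins, contig_to_bin):
    """Return joins whose binned member contigs span more than one bin."""
    conflicts = {}
    for name, members in joins.items():
        # Unbinned contigs carry no bin label, so they cannot cause a conflict.
        bins = {contig_to_bin[c] for c in members if c in contig_to_bin}
        if len(bins) > 1:
            conflicts[name] = sorted(bins)
    return conflicts


print(find_conflicting_joins(joins, contig_to_bin))
# → {'join_1': ['bin_A', 'bin_B']}
```

How a flagged join should then be resolved (for example, by merging the two bins or by reviewing it manually) is left open here.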

@linxingchen
Owner

linxingchen commented Jun 5, 2023 via email

@Microbion
Author

Hi @linxingchen. In my opinion, deciding whether a contig belongs to genome A or genome B requires not only overlaps but also other evidence, such as TNF (tetranucleotide frequency), coverage, phylogenetic assignment, and single-copy marker genes. I plan to use classical refinement tools first, such as the metaWRAP bin refinement module, and then polish the resulting low-completeness bins with COBRA. Because the reassembly step usually decreases completeness, can COBRA replace it?
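
As an illustration of the TNF evidence mentioned above, here is a minimal sketch that computes a tetranucleotide frequency profile for a sequence and compares two profiles. It is a simplification under stated assumptions: the function names `tnf` and `euclidean` are hypothetical, reverse complements are not merged (binning tools commonly do merge them), and windows containing characters other than A, C, G, T are ignored.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tnf(seq):
    """Return the tetranucleotide frequency profile of seq as a list of 256 floats."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only count windows made purely of A/C/G/T (skips N and other symbols).
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(a, b):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy sequences; real contigs would be thousands of bases long.
profile_a = tnf("ACGTACGTACGTACGT")
profile_b = tnf("GGGGCCCCGGGGCCCC")
print(euclidean(profile_a, profile_a))  # identical profiles give 0.0
print(euclidean(profile_a, profile_b) > 0)
```

A small distance between a contig's profile and a bin's profile would be one piece of supporting evidence, to be combined with coverage and marker-gene checks rather than used alone.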

@linxingchen
Owner

> Hi @linxingchen. In my opinion, deciding whether a contig belongs to genome A or genome B requires not only overlaps but also other evidence, such as TNF (tetranucleotide frequency), coverage, phylogenetic assignment, and single-copy marker genes. I plan to use classical refinement tools first, such as the metaWRAP bin refinement module, and then polish the resulting low-completeness bins with COBRA. Because the reassembly step usually decreases completeness, can COBRA replace it?

I have no idea how metaWRAP works, but it sounds like there is a reassembly step. Could you explain more? When you said that the reassembly step usually decreases completeness, I assumed you meant a step that assembles the mapped reads.

For a given bin, COBRA joins contigs by their valid end overlaps (maxK or maxK-1), sometimes using very small contigs that may not be in the bin, to get longer sequences. So COBRA does not use mapped reads for reassembly.
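
The idea of joining by end overlap can be sketched roughly as follows. This is not COBRA's actual code: the helper names `end_overlaps` and `join` are hypothetical, `k` stands for the expected overlap length (maxK or maxK-1), and the sketch only handles forward-strand, exact matches between the end of one contig and the start of another. The real tool applies further checks that are not modeled here.

```python
def end_overlaps(contigs, k):
    """Return (a, b) pairs where the last k bases of a equal the first k bases of b."""
    # Index contigs by their first k bases for quick lookup.
    by_prefix = {}
    for name, seq in contigs.items():
        if len(seq) >= k:
            by_prefix.setdefault(seq[:k], []).append(name)

    pairs = []
    for name, seq in contigs.items():
        if len(seq) < k:
            continue
        for other in by_prefix.get(seq[-k:], []):
            if other != name:  # ignore a contig overlapping itself
                pairs.append((name, other))
    return pairs


def join(a_seq, b_seq, k):
    """Merge two sequences sharing a k-base end overlap, keeping the overlap once."""
    if a_seq[-k:] != b_seq[:k]:
        raise ValueError("sequences do not share the expected end overlap")
    return a_seq + b_seq[k:]


# Toy contigs with k = 4; real values of k depend on the assembler settings.
contigs = {
    "c1": "AAACCCGGT",
    "c2": "CGGTTTAA",
    "c3": "TTTTTTTT",
}
print(end_overlaps(contigs, 4))                  # → [('c1', 'c2')]
print(join(contigs["c1"], contigs["c2"], 4))     # → AAACCCGGTTTAA
```

Note that `c2` in this toy example could be a small contig outside the bin, which is how a join can pull in sequence the binner did not assign.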

@Microbion
Author

Yes, the reassembly step uses the PE reads mapped to one bin. We ran it after the refinement module, but usually got poor results according to CheckM. So I'd like to replace the reassembly step with COBRA to improve bin quality, although the two use distinctly different algorithms.

@linxingchen
Owner

> Yes, the reassembly step uses the PE reads mapped to one bin. We ran it after the refinement module, but usually got poor results according to CheckM. So I'd like to replace the reassembly step with COBRA to improve bin quality, although the two use distinctly different algorithms.

Right, "reassembly of mapped reads won't get better genomes" is a lesson we learned years ago. It would be great to use COBRA for that step, provided the refinement step is good enough.
