Interpretation of Results #45

Closed
owbarber opened this issue Oct 24, 2023 · 3 comments


@owbarber

owbarber commented Oct 24, 2023

I have enjoyed using geNomad and find it to be a very useful tool. When geNomad identifies a plasmid or virus on a particular contig, is it saying that entire contig likely makes up the plasmid? Because the annotated genes cover the length of the contig, so I wanted to make sure I am interpreting this correctly.

Is there additional documentation on the significance of assigning the three types of topology to plasmids in particular? I was told DTR plasmids are perhaps more likely to be closed than ITR, but it would be helpful to have some documentation or links to information about interpreting the topology.

Finally, is there a way to identify where geNomad found the direct or inverted terminal repeats in a contig?

Thank you!

@apcamargo
Owner

> I have enjoyed using geNomad and find it to be a very useful tool. When geNomad identifies a plasmid or virus on a particular contig, is it saying that entire contig likely makes up the plasmid? Because the annotated genes cover the length of the contig, so I wanted to make sure I am interpreting this correctly.

Yes, you are right. Contigs classified as plasmids are most often entirely plasmid sequence. There are some unusual cases where integrative and conjugative elements (ICEs) will be classified as plasmids if the flanking host region is small. In the case of viruses, if flanking host regions are detected, geNomad will extract the virus region and present it as a provirus.

> Is there additional documentation on the significance of assigning the three types of topology to plasmids in particular? I was told DTR plasmids are perhaps more likely to be closed than ITR, but it would be helpful to have some documentation or links to information about interpreting the topology.

DTRs are an indication that a given sequence is complete because they are an assembly artifact that assemblers leave when generating contigs from circular chromosomes or concatemers (you can read more about it here: https://www.nature.com/articles/s41598-017-07910-5). ITRs are not necessarily an indication that a given contig is complete. It is known that some viruses have biological ITRs at the edges of their genomes, so ITRs can be informative. But unless you have an a priori expectation that your genome should possess ITRs if complete, I wouldn't recommend using them as evidence of completeness.

> Finally, is there a way to identify where geNomad found the direct or inverted terminal repeats in a contig?

The DTRs/ITRs will be at least 21 bp long, but geNomad won't tell you the exact length or the coordinates.
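
Since geNomad does not report the coordinates, one way to locate the repeats yourself is to compare the two ends of the contig directly. The following is a minimal Python sketch, not geNomad's own implementation: it assumes exact (mismatch-free) repeats, reuses the 21 bp minimum mentioned above, and the function names and the `max_len` cap are hypothetical choices made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_terminal_repeat(seq: str, min_len: int = 21, max_len: int = 1000):
    """Look for an exact repeat shared by the two ends of a contig.

    Returns (kind, length, left_span, right_span), where the spans are
    1-based inclusive coordinates, or None if no repeat of at least
    min_len bases is found. This is an illustrative check only; it is
    an assumption that exact matching is good enough for your data.
    """
    seq = seq.upper()
    n = len(seq)
    # The two repeat copies must not overlap, so cap at half the contig.
    longest = min(max_len, n // 2)

    # DTR: the first k bases equal the last k bases. Try the longest k first.
    for k in range(longest, min_len - 1, -1):
        if seq[:k] == seq[n - k:]:
            return ("DTR", k, (1, k), (n - k + 1, n))

    # ITR: the first k bases equal the reverse complement of the last k bases.
    # rc[:k] is the reverse complement of seq[n - k:], so extend base by base.
    rc = reverse_complement(seq)
    k = 0
    while k < longest and seq[k] == rc[k]:
        k += 1
    if k >= min_len:
        return ("ITR", k, (1, k), (n - k + 1, n))

    return None
```

Pass the contig sequence as a plain string, for example `find_terminal_repeat(contig_seq)`. A result such as `("DTR", 23, (1, 23), (74, 96))` means a 23 bp direct repeat occupies positions 1-23 and 74-96 of a 96 bp contig.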

@owbarber
Author

Thank you so much for all the information. I know some of it is contextual and knowledge that you don't need to offer as the creator of the tool, but I appreciate you offering it so I can better interpret my results.

@apcamargo
Owner

No worries! I'll consider adding this to the documentation in the future
