Is it better to run geNomad on single genome assemblies? #26

Closed
joelwwh opened this issue Jul 10, 2023 · 1 comment

Comments


joelwwh commented Jul 10, 2023

I was wondering if there is a difference between

(1) Running geNomad 1000 times, once per WGS assembly, for 1000 genomes
or
(2) Running geNomad once, after combining the 1000 WGS assemblies into a single FASTA file

Is there a difference in how it works? Will option (2) be treated like a 'metagenome' and hence run with different parameters? I have personally run (2) but am concerned that the accuracy may be affected.

Also, if a contig is identified as a virus (either prophage or non-integrated), will geNomad extract only the part it thinks is viral, or will it return the whole contig? I have noticed that sometimes the 'coordinates' field is NA and geNomad simply returned the entire bacterial contig unchanged.

@apcamargo
Owner

Option (2) would probably be faster, as long as your hardware can process that much data. Also, if you are using score calibration (--enable-score-calibration), you need a minimum of 1,000 sequences per run. I recommend option (2).
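For option (2), the assemblies first have to be merged into one FASTA file. A minimal sketch of one way to do that in Python is below. The function name `combine_assemblies` and the `__` separator are hypothetical choices for illustration, not part of geNomad. Prefixing each header with the file name is an assumption that keeps contig names unique across assemblies and lets you trace each hit back to its genome. The returned count can be compared against the score calibration minimum mentioned above.

```python
from pathlib import Path


def combine_assemblies(fasta_paths, combined_path):
    """Concatenate FASTA files into one, prefixing each header with its file stem.

    Returns the number of sequences written. The "__" separator is an
    arbitrary (hypothetical) choice, not something geNomad requires.
    """
    n_sequences = 0
    with open(combined_path, "w") as out:
        for path in map(Path, fasta_paths):
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        # e.g. ">contig_1" in genomeA.fna becomes ">genomeA__contig_1"
                        line = f">{path.stem}__{line[1:]}"
                        n_sequences += 1
                    # Guard against files that lack a trailing newline
                    out.write(line if line.endswith("\n") else line + "\n")
    return n_sequences
```

The combined file would then be passed to a single geNomad run as its input, with `--enable-score-calibration` added if the sequence count is high enough.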

If you have sequences with NA in the coordinates field of the _virus_summary.tsv file, it means that geNomad found viral sequences without host segments. It could be: (1) a non-integrated virus, (2) part of a provirus, without a host segment, (3) a provirus with a host segment that was not detected by geNomad.

Detected proviruses will have coordinates in the coordinates field, and geNomad will provide you with the sequence with the host regions removed.
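To see how many entries fall into each case, the summary file can be split on the coordinates field. This is a rough sketch, assuming the TSV has a header row with columns named `seq_name` and `coordinates`, and that provirus coordinates are written as `start-end`. Check both assumptions against your own output. `split_by_coordinates` is a hypothetical helper, not a geNomad function.

```python
import csv


def split_by_coordinates(summary_path):
    """Split virus summary rows into (no_coordinates, proviruses).

    Assumes columns named "seq_name" and "coordinates", with coordinates
    either "NA" or formatted as "start-end".
    """
    no_coordinates, proviruses = [], []
    with open(summary_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            coords = row["coordinates"]
            if coords == "NA":
                # Viral sequence reported without a detected host segment
                no_coordinates.append(row["seq_name"])
            else:
                # Provirus: viral region excised from a longer contig
                start, end = (int(x) for x in coords.split("-"))
                proviruses.append((row["seq_name"], start, end))
    return no_coordinates, proviruses
```

Entries in the first list cover all three NA cases described above, so a long bacterial contig appearing there may be worth inspecting manually.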
