I just found the answer:
> `coordinates`: 1-indexed coordinates of the provirus region within host sequences. Will be `NA` for viruses that were not predicted to be integrated.
Normal prediction result.
| seq_name | length | topology | coordinates | n_genes | genetic_code | virus_score | fdr | n_hallmarks | marker_enrichment | taxonomy |
|---|---|---|---|---|---|---|---|---|---|---|
| CP000388.1\|provirus_774617_826650 | 52034 | Provirus | 774617-826650 | 58 | 11 | 0.9401 | NA | 11 | 45.1836 | Viruses; Duplodnaviria; Heunggongvirae; Uroviricota; Caudoviricetes |
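For anyone slicing the provirus out of the host sequence: a minimal Python sketch of converting the `coordinates` value into a 0-based, half-open slice. The helper name `parse_coordinates` is hypothetical. Treating the end coordinate as inclusive is an assumption, but it is consistent with the row above, where `length` (52034) equals end - start + 1.

```python
def parse_coordinates(coordinates):
    """Parse a 'start-end' coordinates string into a 0-based half-open (start, end).

    Returns None when the value is 'NA' (no provirus coordinates reported).
    Assumes the input is 1-indexed with an inclusive end.
    """
    if coordinates == "NA":
        return None
    start, end = (int(part) for part in coordinates.split("-"))
    # 1-indexed inclusive [start, end] -> 0-based half-open [start - 1, end)
    return start - 1, end


span = parse_coordinates("774617-826650")
start0, end0 = span
print(end0 - start0)  # 52034, matches the length column
```

With a host sequence loaded as a string `host`, `host[start0:end0]` would then give the provirus region.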
Prediction without coordinates (genome is GCA_000166735.2, a MAG).
| seq_name | length | topology | coordinates | n_genes | genetic_code | virus_score | fdr | n_hallmarks | marker_enrichment | taxonomy |
|---|---|---|---|---|---|---|---|---|---|---|
| AEMJ01000831.1 | 698 | No terminal repeats | NA | 1 | 11 | 0.9638 | NA | 0 | 1.7183 | Viruses; Duplodnaviria; Heunggongvirae; Uroviricota; Caudoviricetes |
| AEMJ01000737.1 | 1746 | No terminal repeats | NA | 2 | 11 | 0.9244 | NA | 0 | 1.7183 | Viruses; Duplodnaviria; Heunggongvirae; Uroviricota; Caudoviricetes |
| AEMJ01000706.1 | 826 | No terminal repeats | NA | 1 | 11 | 0.8908 | NA | 0 | 1.4495 | Viruses; Duplodnaviria; Heunggongvirae; Uroviricota; Caudoviricetes |
| AEMJ01000847.1 | 3369 | No terminal repeats | NA | 2 | 11 | 0.8785 | NA | 0 | 1.7183 | Viruses; Duplodnaviria; Heunggongvirae; Uroviricota; Caudoviricetes |
| AEMJ01000526.1 | 288 | No terminal repeats | NA | 2 | 11 | 0.8497 | NA | 0 | 0.0000 | Unclassified |
| AEMJ01000792.1 | 2672 | No terminal repeats | NA | 3 | 11 | 0.8414 | NA | 0 | 1.7183 | Unclassified |
| AEMJ01000320.1 | 1885 | No terminal repeats | NA | 2 | 11 | 0.8297 | NA | 0 | 1.7183 | Unclassified |
| AEMJ01000546.1 | 283 | No terminal repeats | NA | 2 | 11 | 0.8262 | NA | 0 | 0.0000 | Unclassified |
| AEMJ01000712.1 | 349 | No terminal repeats | NA | 2 | 11 | 0.8238 | NA | 0 | 0.0000 | Unclassified |

    $ seqkit stats GCA_000166735.2.fna.gz
    file                    format  type  num_seqs    sum_len  min_len  avg_len  max_len
    GCA_000166735.2.fna.gz  FASTA   DNA        893  2,298,088      101  2,573.4   82,336

    $ seqkit seq -n GCA_000166735.2.fna.gz | head -n 3
    AEMJ01000893.1 UNVERIFIED_ORG: Leuconostoc inhae KCTC 3774 contig00909, whole genome shotgun sequence
    AEMJ01000892.1 UNVERIFIED_ORG: Leuconostoc inhae KCTC 3774 contig00908, whole genome shotgun sequence
    AEMJ01000891.1 UNVERIFIED_ORG: Leuconostoc inhae KCTC 3774 contig00907, whole genome shotgun sequence

Good that you already found the answer :)
As a note, be careful with those very short sequences, particularly the ones without markers. I usually require a hallmark for the short ones.
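A rough sketch of that kind of filter in Python, for illustration only. The cutoff `SHORT_LENGTH = 2500` is a made-up value (no threshold is given in this thread), and `keep_row` is a hypothetical helper. The rows are a subset of the tables above, reduced to three columns.

```python
import csv
import io

# Hypothetical cutoff: the thread does not say what counts as "short".
SHORT_LENGTH = 2500

# A few rows copied from the tables above (subset of columns), tab-separated.
SUMMARY_TSV = (
    "seq_name\tlength\tn_hallmarks\n"
    "CP000388.1|provirus_774617_826650\t52034\t11\n"
    "AEMJ01000831.1\t698\t0\n"
    "AEMJ01000847.1\t3369\t0\n"
    "AEMJ01000546.1\t283\t0\n"
)


def keep_row(row, short_length=SHORT_LENGTH):
    """Keep long sequences; keep short ones only if they carry a hallmark."""
    is_short = int(row["length"]) < short_length
    has_hallmark = int(row["n_hallmarks"]) > 0
    return (not is_short) or has_hallmark


rows = csv.DictReader(io.StringIO(SUMMARY_TSV), delimiter="\t")
kept = [row["seq_name"] for row in rows if keep_row(row)]
print(kept)  # ['CP000388.1|provirus_774617_826650', 'AEMJ01000847.1']
```

The same predicate could be applied to a full summary file by passing an open file handle to `csv.DictReader` instead of the in-memory string.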
I forgot to say that I used the --relaxed flag.
Thanks for the message. :)