[BUG] dnaA truncated by PGAP gene calling #252
Comments
Thank you, Tue, for your report. We will investigate this issue in an internal ticket. |
Agreed. Thanks! |
Could you please post some of the input genomes? |
I can't post them here unfortunately, as they are "embargoed", but I can send a safe download link to your email address. Do you want only the ones with /pseudo, or also some of the ones with the same gene calling but without /pseudo on dnaA? |
Sure.
The more examples the better. Thanks! |
Great, I'll collect the genomes and send you a link tomorrow, Wednesday.
Best,
Tue
|
Looking forward to it, Tue! |
I got the genomes, Tue, thanks! |
Unfortunately, these are output data, not the input data. We need the input FASTA files. |
Dear Azat, |
Thanks, Tue! Not yet. So far I got only the original package from 4/5. The data goes through a different group, they will notify us when it comes. |
OK. I sent a link to the FASTA files on Tuesday, so hopefully they will make their way to you soon. |
Tue, judging by the output you sent us in the first tarball, it looks like you did not use standalone PGAP for these annotations, but the GenBank submission service. Could you please confirm? |
Yes, that is correct, the annotation was performed at NCBI, not by the standalone PGAP CLI. |
Thank you for confirming, Tue. That explains the confusion. |
Hi again, the genomes in PGAP input format (FASTA with info in the header) that I sent weren't received. Do you want me to re-upload them, and how do I get the download link to you if genomes@ncbi.nlm.nih.gov is not a good channel? Tue |
Since they were submitted via GenBank, we have the input data already. Thanks! |
A quick follow-up on this, in case anyone stumbles on this bug report: the dnaA gene calling was modified/improved, which resolved all the observed issues by identifying the complete dnaA genes rather than partial genes. |
Thank you, Tue! |
Describe the bug
While depositing some bacterial genomes, I noticed that 9 of them came back with a /pseudo tag on the dnaA gene after PGAP annotation. They all have this field:
/note="incomplete; partial on complete genome; missing
N-terminus; Derived by automated computational analysis
using gene prediction method: Protein Homology."
However, these genes were not truncated in a Prodigal annotation of the same genomes, and the closest database reference determined by autoMLST also has the full dnaA sequence (see AA alignment below).
For most of the strains with the /pseudo tag, Streptomyces niveus strains have the highest %ANI from autoMLST (Streptomyces_niveus_GCF_002009175, Streptomyces_niveus_NCIMB_11891_GCF_000497425).
The submitted genomes are not yet publicly available, but I can supply you with the GenBank files if necessary.
Since this is annotation performed at NCBI, I don't have the log files and software versions requested in the bug report form.
The issue is not consistent: it affects 9 genomes, while many of the other Streptomyces genomes I deposited do not have the /pseudo tag on dnaA.
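For anyone who wants to check their own returned annotations for the same symptom, here is a rough sketch of how the affected files could be found. It is not part of PGAP or any NCBI tool; the function names `iter_features` and `pseudo_features` are made up for illustration, and the parsing assumes the standard GenBank flat-file layout (feature keys indented 5 spaces, qualifiers on their own indented lines).

```python
import re

# Assumption: a feature line is 5 spaces, a feature key, then its location.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+(\S.*)$")


def iter_features(genbank_text):
    """Yield (key, location, qualifier_lines) for each feature in the text."""
    in_table = False
    current = None
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table:
            continue
        if line and not line.startswith(" "):
            # A new top-level section (e.g. ORIGIN) ends the feature table.
            if current:
                yield current
            current = None
            in_table = False
            continue
        match = FEATURE_START.match(line)
        if match:
            if current:
                yield current
            current = (match.group(1), match.group(2), [])
        elif current is not None:
            # Qualifier line (or a continuation of a multi-line value).
            current[2].append(line.strip())
    if current:
        yield current


def pseudo_features(genbank_text, gene="dnaA", key="CDS"):
    """Return locations of features for `gene` that carry a /pseudo qualifier."""
    hits = []
    for feat_key, location, quals in iter_features(genbank_text):
        if feat_key != key:
            continue
        if f'/gene="{gene}"' in quals and "/pseudo" in quals:
            hits.append(location)
    return hits
```

To use it, read each returned GenBank file as text (for example with `pathlib.Path(path).read_text()`) and pass the string to `pseudo_features`; a non-empty list means that file has a dnaA CDS tagged /pseudo. It only looks at the `/gene` qualifier, so a dnaA feature annotated without a gene name would be missed.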
Best
Tue Sparholt Jørgensen