missing gene id after converting gff to gtf #45
Comments
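That is a severely malformed GFF you have there: the Parent attribute is missing for all sub-features (CDS should normally be parented by a transcript or gene feature), and in some cases the attributes column (9th) is enclosed by double quotes (why?!). Even so, some of those CDS features (I suppose each of them is a gene, since this is a prokaryotic organism) simply do not have a gene name or any gene ID among the attributes, e.g.:
ID=LBPNIFED_02030;inference=ab initio prediction:Prodigal:2.6;locus_tag=LBPNIFED_02030;product=hypothetical protein
The output you obtained is probably missing some of those malformed entries (with double-quoted attributes), and gffread cannot add a gene_id unless an attribute named "gene", "geneID" or "gene_name" is found in the input GFF record.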
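One possible workaround, following the diagnosis above, is to pre-process the GFF before running gffread. The Python sketch below is only an illustration, not part of gffread or Prokka: it strips double quotes that wrap the whole 9th column and, when a record has none of the "gene", "geneID" or "gene_name" attributes, appends a gene_name copied from locus_tag (or from ID as a fallback). The function names and the choice of gene_name as the attribute to add are assumptions. It does not add the missing Parent attributes, and whether gffread then emits a gene_id for every record would need to be checked on the real file.

```python
# Attribute names that gffread is said (in this thread) to look for.
GENE_KEYS = ("gene", "geneID", "gene_name")


def patch_attributes(attrs, fallback_keys=("locus_tag", "ID")):
    """Return a cleaned GFF3 attribute string with a gene_name added if needed.

    Hypothetical helper: the fallback to locus_tag/ID is an assumption.
    """
    attrs = attrs.strip()
    # Drop double quotes that enclose the entire attributes column.
    if len(attrs) >= 2 and attrs[0] == '"' and attrs[-1] == '"':
        attrs = attrs[1:-1]
    pairs = [p for p in attrs.split(";") if p]
    values = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    # Leave records alone if they already carry a gene-like attribute.
    if any(key in values for key in GENE_KEYS):
        return ";".join(pairs)
    for key in fallback_keys:
        if values.get(key):
            pairs.append(f"gene_name={values[key]}")
            break
    return ";".join(pairs)


def patch_gff_line(line):
    """Patch one GFF line; comments, blanks and non-feature lines pass through."""
    if line.startswith("#") or not line.strip():
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    cols[8] = patch_attributes(cols[8])
    return "\t".join(cols) + "\n"


def patch_gff(lines):
    """Yield patched lines, copying everything from ##FASTA onward unchanged."""
    in_fasta = False
    for line in lines:
        if line.startswith("##FASTA"):
            in_fasta = True
        yield line if in_fasta else patch_gff_line(line)
```

To try it, open the input and output files and write the result of `patch_gff(infile)` with `outfile.writelines(...)`, then run the same `gffread -T` command on the patched file.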
Thanks for the reply. I guess I should also ask the developer of the
annotation software whether I am making any mistakes here. Is there any
software you would recommend for annotating draft bins that produces
well-formed GFF files?
Thanks
On Monday, January 20, 2020, Geo Pertea <notifications@github.com> wrote:
That is a severely malformed GFF you have there, the Parent attribute is
missing for all sub-features (CDS should normally be parented by a
transcript or gene features) and in some cases the attributes column (9th)
is enclosed by double quotes (why?!) and even so, some of those CDS
features (I suppose each of them is a gene, since this is a prokaryotic
organism) simply do not have a gene name or any gene ID among the
attributes, e.g.:
ID=LBPNIFED_02030;inference=ab initio prediction:Prodigal:2.6;locus_tag=LBPNIFED_02030;product=hypothetical protein
The output you obtained is probably missing some of those malformed
entries (with double-quoted attributes) and gffread cannot add a gene_id
unless an attribute named "gene", "geneID" or "gene_name" is found in
the input GFF record.
Hi,
I am using gffread v0.11.6 with the command
./../tool/gffread/gffread -T sample4_prokka_35.gff -o sample4_prokka_35.gtf
to convert a Prokka-generated GFF file to GTF, but I found that some transcripts do not have a gene_id.
Is there a way to solve this problem? It affects the downstream analysis. The conversion worked fine when I used a GFF file downloaded from NCBI.
Attached are my gff and generated gtf files.
sample4_prokka_35.gtf.txt
sample4_prokka_35.gff.txt
Thanks a lot.
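For anyone who wants to check how many records are affected, here is a small Python sketch that lists the transcript_id values in a GTF whose attribute column has no gene_id. It is only an illustration and the function name is made up; it assumes the usual GTF attribute layout of `key "value";` pairs in the 9th tab-separated column.

```python
import re

# Match gene_id "..." and transcript_id "..." in a GTF attribute column.
GENE_ID_RE = re.compile(r'\bgene_id\s+"[^"]+"')
TRANSCRIPT_ID_RE = re.compile(r'\btranscript_id\s+"([^"]+)"')


def transcripts_missing_gene_id(gtf_lines):
    """Return transcript IDs (in first-seen order) whose records lack a gene_id."""
    missing = []
    seen = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        attrs = cols[8]
        if GENE_ID_RE.search(attrs):
            continue
        match = TRANSCRIPT_ID_RE.search(attrs)
        tid = match.group(1) if match else "<no transcript_id>"
        if tid not in seen:
            seen.add(tid)
            missing.append(tid)
    return missing
```

Calling it on an open file handle, e.g. `transcripts_missing_gene_id(open("sample4_prokka_35.gtf"))`, returns the list of affected transcript IDs.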