missing gene id after converting gff to gtf #45

Closed
ze236789 opened this issue Jan 20, 2020 · 2 comments

Comments

@ze236789

Hi,

I am using gffread v0.11.6 with the command
./../tool/gffread/gffread -T sample4_prokka_35.gff -o sample4_prokka_35.gtf
to convert a Prokka-generated GFF file to GTF, but I found that some transcripts do not have a gene_id.
Is there a way to solve this problem? It affects the downstream analysis. The conversion worked fine when I used a GFF file downloaded from NCBI.

Attached are my gff and generated gtf files.
sample4_prokka_35.gtf.txt
sample4_prokka_35.gff.txt

Thanks a lot.

@gpertea
Owner

gpertea commented Jan 21, 2020

That is a severely malformed GFF you have there. The Parent attribute is missing for all sub-features (a CDS should normally be parented by a transcript or gene feature), and in some cases the attributes column (the 9th) is enclosed in double quotes (why?!). Even so, some of those CDS features (I suppose each of them is a gene, since this is a prokaryotic organism) simply do not have a gene name or any gene ID among their attributes, e.g.:

ID=LBPNIFED_02030;inference=ab initio prediction:Prodigal:2.6;locus_tag=LBPNIFED_02030;product=hypothetical protein

The output you obtained is probably missing some of those malformed entries (the ones with double-quoted attributes), and gffread cannot add a gene_id unless an attribute named "gene", "geneID" or "gene_name" is found in the input GFF record.
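
One possible workaround, sketched here in Python, is to pre-process the GFF before running gffread: strip the double quotes around the 9th column and, for records that have none of the "gene", "geneID" or "gene_name" attributes, copy the locus_tag value into a new gene attribute. The function names below are hypothetical, and reusing locus_tag as the gene name is an assumption about what is acceptable downstream, not something gffread does itself. The sketch does not add the missing Parent attributes, so the resulting GTF should still be checked by hand.

```python
import re

# Attribute names that, per the comment above, gffread looks for in a record.
GENE_KEYS = {"gene", "geneID", "gene_name"}


def repair_attributes(attrs: str) -> str:
    """Unquote a GFF attributes column and add gene=<locus_tag> if no gene attribute exists."""
    attrs = attrs.strip()
    # Some records have the whole 9th column wrapped in double quotes.
    if len(attrs) >= 2 and attrs.startswith('"') and attrs.endswith('"'):
        attrs = attrs[1:-1]
    keys = {field.split("=", 1)[0].strip() for field in attrs.split(";") if "=" in field}
    if not keys & GENE_KEYS:
        # Assumption: the locus_tag is an acceptable stand-in for a gene name.
        match = re.search(r"(?:^|;)\s*locus_tag=([^;]+)", attrs)
        if match:
            attrs = attrs.rstrip(";") + ";gene=" + match.group(1)
    return attrs


def repair_gff_line(line: str) -> str:
    """Repair one line; comments and non-feature lines (e.g. FASTA) pass through unchanged."""
    if line.startswith("#"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    cols[8] = repair_attributes(cols[8])
    return "\t".join(cols) + "\n"


def repair_gff_file(src_path: str, dst_path: str) -> None:
    """Write a repaired copy of the GFF at src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(repair_gff_line(line))
```

Calling `repair_gff_file("sample4_prokka_35.gff", "fixed.gff")` (the output name is only an example) and then running gffread -T on the repaired file is the intended use. Records that already carry a gene attribute are left untouched.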

@ze236789
Author

ze236789 commented Jan 21, 2020 via email

@gpertea gpertea closed this as completed Feb 12, 2020