convert_gff3_to_gbk.py, add full support for non-protein-coding genes #24
Comments
Came to post an issue, but I think I'm having the same problem noted above, so I will just add a concrete example of why this is a problem. I am trying to extract 16S sequences that are annotated in a GenBank file (example). The fact that a gene is the 16S sequence is identified by the product name in the GenBank file.
However, the product name doesn't make it into the GFF3 file, so it is impossible to select the 16S sequences downstream separately from other rRNAs.
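For illustration, here is a minimal sketch of the downstream selection that becomes possible once the product name is carried over. It assumes the converter writes the product as a `product` attribute on `rRNA` features in column 9 of the GFF3; that attribute name, the function names, and the example records are assumptions for this sketch and are not taken from biocode.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Parse a GFF3 attributes column (key=value pairs joined by ';') into a dict."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = unquote(value)  # GFF3 percent-encodes reserved characters
    return attrs


def select_16s(gff3_lines):
    """Yield (seqid, start, end, strand) for rRNA features whose product mentions 16S."""
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "rRNA":
            continue  # not a feature row, or not an rRNA feature
        # Assumption: the product name is stored under the 'product' key.
        product = parse_attributes(cols[8]).get("product", "")
        if "16S" in product:
            yield cols[0], int(cols[3]), int(cols[4]), cols[6]


# Hypothetical records, made up for this example.
example = [
    "##gff-version 3",
    "contig1\t.\trRNA\t100\t1640\t.\t+\t.\tID=rrna1;Parent=gene1;product=16S ribosomal RNA",
    "contig1\t.\trRNA\t2000\t4900\t.\t+\t.\tID=rrna2;Parent=gene2;product=23S ribosomal RNA",
]
hits = list(select_16s(example))
```

Without the product attribute, both rows above would look like indistinguishable rRNA features, which is the problem described here.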
I'll see if I can get this added tonight.
Last night has shifted into today.
@mikemc Is it possible to attach your GBK file so I can test with it, or is it private?
@jorvis The example I gave is from this GenBank file
@mikemc - The current version of the code should fix your issue. The tRNAs now export with anticodon reported and rRNAs with product. I'm not closing this ticket yet, as what @jonathancrabtree reported is actually the reverse conversion, going from GFF3 -> GBK.
Closing. I've now confirmed that tRNA and rRNA annotation is retained when a source GenBank flat file is converted to GFF3 and then converted back to GenBank.
Great, thanks @jorvis! I haven't had a chance to test yet, but it sounds like this covers my issue.
If convert_gff3_to_gbk.py finds a tRNA, rRNA, or other non-protein-coding gene in the input GFF3, it will output the parent "gene" feature in the output GenBank file, but nothing else. Only protein-coding genes with an mRNA feature below the parent gene appear to be converted fully. It looks like biocodegenbank.print_biogene needs to be generalized to handle all gene types, or at least all those that currently have a corresponding representation in the biothings module.
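As a rough sketch of what "generalized" could mean here, the snippet below emits a GenBank feature-table entry for the parent gene and then one for each child feature, whatever its type (mRNA, tRNA, rRNA, and so on). It does not use or reproduce biocode's API: `format_feature`, `format_gene`, and the dictionary layout are hypothetical names chosen for this example, and only the `/locus_tag` and `/product` qualifiers are handled.

```python
QUALIFIER_INDENT = " " * 21  # GenBank qualifiers start in column 22


def format_feature(key, start, end, strand, qualifiers):
    """Return the feature-table lines for one feature (key, location, qualifiers)."""
    location = f"{start}..{end}"
    if strand == "-":
        location = f"complement({location})"
    # Feature key starts in column 6; the location starts in column 22.
    lines = ["     " + key.ljust(16) + location]
    for name, value in qualifiers.items():
        lines.append(f'{QUALIFIER_INDENT}/{name}="{value}"')
    return lines


def format_gene(gene):
    """Emit the parent gene, then every child feature regardless of its type."""
    base = {"locus_tag": gene["locus_tag"]}
    lines = format_feature("gene", gene["start"], gene["end"], gene["strand"], base)
    for child in gene["children"]:
        quals = dict(base)
        if child.get("product"):
            quals["product"] = child["product"]
        lines += format_feature(
            child["type"], child["start"], child["end"], gene["strand"], quals
        )
    return lines


# Hypothetical gene record, made up for this example.
rrna_gene = {
    "locus_tag": "ABC_0001",
    "start": 100,
    "end": 1640,
    "strand": "-",
    "children": [
        {"type": "rRNA", "start": 100, "end": 1640, "product": "16S ribosomal RNA"},
    ],
}
output = format_gene(rrna_gene)
print("\n".join(output))
```

The point of the design is that the child loop does not branch on gene type, so a tRNA or rRNA child is written out the same way an mRNA child would be. Type-specific qualifiers such as a tRNA anticodon would need extra handling that is not shown.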