GFF: parse_simple fails in some cases #80

Closed
khughitt opened this Issue Nov 11, 2013 · 1 comment

Contributor

khughitt commented Nov 11, 2013

To reproduce:

from BCBio import GFF

# http://www.broadinstitute.org/annotation/gebo/help/data/gff3/transcripts.gff3
input_file = 'transcripts.gff3'

list(GFF.parse_simple(open(input_file)))

Error output:

KeyError                                  Traceback (most recent call last)
<ipython-input-10-32ebcd95dc0c> in <module>()
----> 1 list(GFF.parse_simple(open(infile)))

/home/keith/software/bcbb/gff/BCBio/GFF/GFFParser.pyc in parse_simple(gff_files, limit_info)
    721     parser = GFFParser()
    722     for rec in parser.parse_simple(gff_files, limit_info=limit_info):
--> 723         yield rec["child"][0]
    724 
    725 def _file_or_handle(fn):

KeyError: 'child'

I checked the output of parser.parse_simple(gff_files, limit_info=limit_info): some records contain only a parent key and have no child key, which is what triggers the KeyError.

For example, with the file above:

[{'parent': [{'id': 'newGene',
    'is_gff2': False,
    'location': [499, 2610],
    'quals': {'ID': ['newGene']},
    'rec_id': 'edit_test.fa',
    'strand': 1,
    'type': 'gene'}]},
 {'child': [{'id': 't1',
    'is_gff2': False,
    'location': [499, 2385],
    'quals': {'ID': ['t1'],
     'Name': ['t1(newGene)'],
     'Namo': ['reinhard+did+this'],
     'Parent': ['newGene'],
     'uri': ['http://www.yahoo.com']},
    'rec_id': 'edit_test.fa',
    'strand': 1,
    'type': 'mRNA'}]},
   ...
]

If you want to treat the parent and child nodes the same, a simple fix would be:

yield rec.get('child', rec.get('parent'))[0]
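To illustrate how that fallback behaves, here is a minimal sketch that uses plain dictionaries shaped like the records above instead of real parser output. The `first_feature` helper name is hypothetical and not part of BCBio.

```python
# Stand-in records modeled on the parser output shown above (fields trimmed).
records = [
    {"parent": [{"id": "newGene", "type": "gene"}]},
    {"child": [{"id": "t1", "type": "mRNA"}]},
]


def first_feature(rec):
    """Return the first child feature, or the first parent feature if no child exists."""
    return rec.get("child", rec.get("parent"))[0]


features = [first_feature(rec) for rec in records]
print([f["id"] for f in features])  # prints ['newGene', 't1']
```

Note that a record with neither key would still fail here, since indexing the `None` returned by `rec.get('parent')` raises a TypeError.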

Hopefully this time it is an actual issue and not just a misunderstanding on my part :)

If the above solution is appropriate, I would be glad to submit a patch.

@chapmanb chapmanb closed this in 75e0078 Nov 13, 2013

Owner

chapmanb commented Nov 13, 2013

Keith;
Thanks for reporting this problem. Your solution is exactly right; I generalized it a bit to also handle directive lines and pushed the fix. Thanks again for catching this one, and let me know if you run into any other problems at all.
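The commit itself is not shown in this thread, so the following is only a sketch of one way the fallback could be generalized so that records carrying neither a child nor a parent key (directive lines, for instance) are skipped rather than raising. The `iter_features` helper and the `directive` key name are assumptions for illustration, not the code from 75e0078.

```python
def iter_features(records):
    """Yield one feature per record, preferring 'child' over 'parent'."""
    for rec in records:
        for key in ("child", "parent"):
            if key in rec:
                yield rec[key][0]
                break
        # Records with neither key (e.g. directive lines) are skipped.


# Stand-in records; the shape of the directive entry is an assumption.
records = [
    {"directive": "gff-version 3"},
    {"parent": [{"id": "newGene"}]},
    {"child": [{"id": "t1"}]},
]

ids = [feature["id"] for feature in iter_features(records)]
print(ids)  # prints ['newGene', 't1']
```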
