segfault when using -g with ensembl gff3 #4

Closed
bwlang opened this issue Dec 31, 2014 · 3 comments

bwlang commented Dec 31, 2014

Here's the gff3 file: ftp://ftp.ensemblgenomes.org/pub/release-24/plants/gff3/oryza_sativa/Oryza_sativa.IRGSP-1.0.24.gff3

TopHat accepted this file without obvious errors.

Here's the backtrace from Cufflinks 2.2.1 (just downloaded):

Program received signal SIGSEGV, Segmentation fault.
0x000000378a680b51 in __strlen_sse2 () from /lib64/libc.so.6
(gdb) bt
#0  0x000000378a680b51 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x0000000000542cfd in GffObj::parseAttrs (this=0x8b7b00, atrlist=@0x8b7bf0, info=0x0, isExon=false) at gff.cpp:1379
#2  0x0000000000544155 in GffObj::addExon (this=0x8b7b00, reader=<value optimized out>, gl=0x8b3c40, keepAttr=true, noExonAttr=true) at gff.cpp:522
#3  0x00000000005452d0 in GffObj::GffObj (this=0x8b7b00, gfrd=0x7fffffffc020, gffline=0x8b3c40, keepAttr=true, noExonAttr=true) at gff.cpp:827
#4  0x0000000000545aa2 in GffReader::newGffRec (this=0x7fffffffc020, gffline=0x8b3c40, keepAttr=<value optimized out>, noExonAttr=true, parent=0x0,
    pexon=0x0, glst=0x0) at gff.cpp:999
#5  0x00000000005464dc in GffReader::readAll (this=0x7fffffffc020, keepAttr=true, mergeCloseExons=true, noExonAttr=true) at gff.cpp:1228
#6  0x000000000054fb22 in read_transcripts (f=0x88d320, seqdata=..., crc_result=..., keepAttrs=true) at gtf_tracking.cpp:600
#7  0x00000000004a8c4a in load_ref_rnas (ref_mRNA_file=0x88d320, rt=..., ref_mRNAs=..., gtf_crc_result=..., loadSeqs=false, loadFPKM=false)
    at bundles.cpp:104
#8  0x000000000040f014 in driver (hit_file_name=<value optimized out>, ref_gtf=0x88d320, mask_gtf=0x0) at cufflinks.cpp:1623
#9  0x000000000042477a in main (argc=14, argv=<value optimized out>) at cufflinks.cpp:1799
Collaborator

gpertea commented Jan 14, 2015

That GFF3 file seems badly formatted: the essential attribute "Parent" is spelled "PARENT", which is incorrect (these attributes are case sensitive according to the GFF3 specification).
Simply transforming the file with something like perl -pe 's/(NAME|PARENT)=/\L\u$1=/g' should fix this. I suspect TopHat just failed silently when loading that annotation (i.e. no transcript features were actually loaded).
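
For readers who would rather not run a regex over the whole line, here is a rough Python sketch of the same transformation restricted to the attributes column (column 9). The function name `fix_attribute_case` is hypothetical, and only the two keys named above (NAME and PARENT) are handled; whether other keys in that file are affected is not established in this thread.

```python
import re

# Keys named in this thread, matched only at the start of an attribute
# (beginning of column 9 or right after a ';'). Extending this list to other
# keys is an assumption the thread does not confirm.
_KEY_RE = re.compile(r"(^|;)(NAME|PARENT)=")


def fix_attribute_case(line: str) -> str:
    """Rewrite NAME=/PARENT= as Name=/Parent= in column 9 of a GFF3 line."""
    if line.startswith("#"):
        return line  # directives and comments are left untouched
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line  # not a feature line; leave it alone
    fields[8] = _KEY_RE.sub(
        lambda m: m.group(1) + m.group(2).capitalize() + "=", fields[8]
    )
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

Applied line by line to the input file, this should give the same result as the perl one-liner for these two keys, while leaving the other eight columns untouched.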

gpertea closed this as completed Jan 14, 2015
Author

bwlang commented Jan 20, 2015

Sounds like it's a bad GFF file (shame on Ensembl...), but I don't think a segfault is the best response.
Perhaps an error message giving the location in the input would head off this kind of question in the future?
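
To illustrate the kind of check meant here, the following is a minimal Python sketch of a pre-flight scan that reports the line number of any reserved GFF3 attribute tag written in the wrong case. It is not the Cufflinks code (which is C++), and the function name `find_miscased_attributes` is hypothetical; the tag list is the set of reserved attributes from the GFF3 specification.

```python
# Attribute tags reserved by the GFF3 specification, with their exact spelling.
RESERVED = {
    "ID", "Name", "Alias", "Parent", "Target", "Gap", "Derives_from",
    "Note", "Dbxref", "Ontology_term", "Is_circular",
}
_BY_LOWER = {tag.lower(): tag for tag in RESERVED}


def find_miscased_attributes(lines):
    """Yield (line_number, found_key, expected_key) for mis-cased reserved tags."""
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a 9-column feature line
        for pair in fields[8].split(";"):
            key = pair.split("=", 1)[0].strip()
            expected = _BY_LOWER.get(key.lower())
            if expected is not None and key != expected:
                yield lineno, key, expected
```

Iterating over an open file handle and printing each tuple gives a message such as "line 3: found PARENT, expected Parent", which points straight at the offending record before the annotation is passed to another tool.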

Collaborator

gpertea commented Jan 20, 2015

I fully agree. I have added some code that checks for situations like this; hopefully it will make its way into the next release.
