STARlong terminating after throwing an instance of 'std::out_of_range' #85
Comments
To be clear, this also happens with the released statically compiled binaries, not just those compiled on our machine. If I use those, the error from gdb is more terse:
(gdb) run --runMode alignReads --outSAMtype BAM Unsorted --outSAMattributes NH HI NM MD AS XS jM jI --readNameSeparator space --outFilterMultimapScoreRange 1 --outFilterMismatchNoverLmax 0.05 --scoreGapNoncan -20 --scoreGapGCAG -4 --scoreGapATAC -8 --scoreDelOpen -1 --scoreDelBase -1 --scoreInsOpen -1 --scoreInsBase -1 --alignEndsType Local --seedSearchStartLmax 50 --seedPerReadNmax 100000 --seedPerWindowNmax 1000 --alignTranscriptsPerReadNmax 100000 --alignTranscriptsPerWindowNmax 10000 --genomeDir /tgac/workarea/users/venturil/Private/Wheat/NewRelease/Reference/ --runThreadN 49 --readFilesIn /tgac/workarea/users/venturil/Private/Wheat/Reads/PacBio/FirstBatch/IsoSeq/ReadsOfInsert/A02_1/data/isoseq_flnc.fasta
Starting program: /tgac/software/testing/star/2.5.0a/src/STAR-STAR_2.5.0a/bin/Linux_x86_64_static/STARlong --runMode alignReads --outSAMtype BAM Unsorted --outSAMattributes NH HI NM MD AS XS jM jI --readNameSeparator space --outFilterMultimapScoreRange 1 --outFilterMismatchNoverLmax 0.05 --scoreGapNoncan -20 --scoreGapGCAG -4 --scoreGapATAC -8 --scoreDelOpen -1 --scoreDelBase -1 --scoreInsOpen -1 --scoreInsBase -1 --alignEndsType Local --seedSearchStartLmax 50 --seedPerReadNmax 100000 --seedPerWindowNmax 1000 --alignTranscriptsPerReadNmax 100000 --alignTranscriptsPerWindowNmax 10000 --genomeDir /tgac/workarea/users/venturil/Private/Wheat/NewRelease/Reference/ --runThreadN 49 --readFilesIn /tgac/workarea/users/venturil/Private/Wheat/Reads/PacBio/FirstBatch/IsoSeq/ReadsOfInsert/A02_1/data/isoseq_flnc.fasta
[Thread debugging using libthread_db enabled]
[.. threads starting .. ]
[Thread 0x2b0189410700 (LWP 585970) exited]
[ .. threads exiting .. ]
Program received signal SIGSEGV, Segmentation fault.
Also, to be clear, the problem appears to be triggered by the size of the database. If I test STAR on 4 non-overlapping subsections that together cover all of the data, i.e. the 3 subgenomes and the unassigned fraction, STAR performs as expected.
One thing: I just rechecked the logs and realized that there might be something amiss with the indices. With a previous version of this genome, STAR created a Suffix Array with 12,935,398,406 (~13 billion) indices, whereas the current version contains 25,315,452,282 (~25 billion). Might this be at the root of the error?
Submitted a pull request dealing with this bug.
Hi Luca, there is something strange about the genome index size. Could you please send me the Log.out file of the genome generation run? Cheers
Sure thing, here it is (in gzip format). Incidentally, I think I will try your new commits for STARlong, as the pull request I submitted seems to work only for Illumina reads. Update: I tested the new commits; unfortunately the bug is still present for PacBio reads. Cheers
Hi Luca, the Log.out from the genome generation step uses the default --genomeSAsparseD; however, the mapping was done with a genome that had --genomeSAsparseD 3. Also, it seems that the genome generation was done with one of the "2.5.0a_alpha" patches. Cheers
Hi,
unfortunately I have found another bug, again on wheat. This particular crash happens on the new wheat genome assembly (see here: http://www.tgac.ac.uk/news/241/68/The-Genome-Analysis-Centre-announces-an-important-milestone-in-wheat-research/).
Command line:
--runMode alignReads --outSAMtype BAM Unsorted --readNameSeparator space --genomeDir /tgac/workarea/users/venturil/Private/Wheat/NewRelease/Reference/ --runThreadN 16 --readFilesIn reads.fa
Starting program: /tgac/software/testing/star/2.5.0a/x86_64/bin/STARlong --runMode alignReads --outSAMtype BAM Unsorted --readNameSeparator space --genomeDir /tgac/workarea/users/venturil/Private/Wheat/NewRelease/Reference/ --runThreadN 16 --readFilesIn reads.fa
Detailed stack trace obtained by running said command from inside GDB:
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 18446744073709551615) >= this->size() (which is 735943)
Program received signal SIGABRT, Aborted.
0x00002aaaabd0f8a5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00002aaaabd0f8a5 in raise () from /lib64/libc.so.6
#1 0x00002aaaabd11085 in abort () from /lib64/libc.so.6
#2 0x00002aaaab15e115 in __gnu_cxx::__verbose_terminate_handler () at ../../../../gcc-4.9.1/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00002aaaab15c176 in __cxxabiv1::__terminate (handler=) at ../../../../gcc-4.9.1/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00002aaaab15c1c1 in std::terminate () at ../../../../gcc-4.9.1/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00002aaaab15c3d8 in __cxxabiv1::__cxa_throw (obj=0x70e860, tinfo=0x2aaaab3f3670 , dest=0x2aaaab175750 std::out_of_range::~out_of_range())
#6 0x00002aaaab1b768f in std::__throw_out_of_range_fmt (__fmt=) at ../../../../../gcc-4.9.1/libstdc++-v3/src/c++11/functexcept.cc:101
#7 0x0000000000443d02 in Junction::outputStream(std::ostream&, Parameters*) ()
#8 0x0000000000445446 in outputSJ(ReadAlignChunk*, Parameters) ()
#9 0x0000000000413124 in main ()
I attach a ZIP file with the Log
Debug_wheat.txt
Unfortunately, the whole genome itself and its indices are far less portable than the 3B chromosome, and the core dump is ~70GB.
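A note on the number in the exception message: 18446744073709551615 is 2^64 - 1, the value a 64-bit unsigned integer holds after -1 is converted to it. The sketch below only illustrates how such an index makes a bounds-checked `vector::at()` throw `std::out_of_range`. It is not STAR's code: `find_index` and `at_throws` are hypothetical helpers, and whether the real index comes from a failed lookup is an assumption, not something the trace confirms.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// The value -1 takes when converted to a 64-bit unsigned integer (2^64 - 1).
std::uint64_t wrapped_minus_one() {
    return static_cast<std::uint64_t>(-1);
}

// Hypothetical lookup (not from STAR): returns the position of `name`
// in `names`, or -1 converted to std::size_t when the name is absent.
std::size_t find_index(const std::vector<std::string>& names,
                       const std::string& name) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == name) return i;
    }
    return static_cast<std::size_t>(-1);  // wraps to the largest std::size_t
}

// Returns true if the bounds-checked accessor rejects `index`.
bool at_throws(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);  // at() checks index < size(); operator[] does not
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Calling `at_throws` on a vector of 735943 elements with the index returned by a failed `find_index` gives the same kind of failure as in the trace above. An uncaught exception of this type ends in `std::terminate` and SIGABRT, which matches frames #1 to #6.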