blastx segfault #399

nextgenusfs · 2020-10-17T16:31:50Z

A few users have reported diamond v2.0.4 blastx segfaulting; more info here: nextgenusfs/funannotate#503

$ diamond blastx --threads 80 -q genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-10 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

...
Building query seed array... [0.42s]
Computing hash join... [0.215s]
Building seed filter... [0.038s]
Searching alignments... [1.321s]
Deallocating buffers... [0.096s]
Clearing query masking... [0.285s]
Computing alignments...
Segmentation fault (core dumped)

It could possibly be related to #397, as I don't see the same error in my smaller tests.


estolle · 2020-10-17T20:29:17Z

Same here.

I ran diamond on a database of proteins I had previously used successfully. The only difference now is the FASTA sequence I compare against the DB. This time it has some very long contigs/scaffolds (up to 16 Mb).

I tried reducing the number of threads, excluding unaligned sequences (--unal 0), the -c1 option as suggested for better performance, and relaxing the e-value cutoff (from 1e-10 to 1e-8).
Diamond v2.0.4.142

diamond blastx --threads 8 -q /scratch/ek/stingless.bee.genomics/annotation/funannotate/TetragonulaCarbonaria/2020_10_16_TetragonulaCarbonaria/predict_misc/genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-10 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

Computing alignments...
segmentation fault

diamond blastx --threads 80 -q /scratch/ek/stingless.bee.genomics/annotation/funannotate/TetragonulaCarbonaria/2020_10_16_TetragonulaCarbonaria/predict_misc/genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-10 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

Computing alignments...
segmentation fault

diamond blastx --threads 6 --log -q /scratch/ek/stingless.bee.genomics/annotation/funannotate/TetragonulaCarbonaria/2020_10_16_TetragonulaCarbonaria/predict_misc/genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-10 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

Computing alignments...
no further output (as above), yet no segfault (still running)

diamond blastx -c1 --log --threads 80 -q /scratch/ek/stingless.bee.genomics/annotation/funannotate/TetragonulaCarbonaria/2020_10_16_TetragonulaCarbonaria/predict_misc/genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-8 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

...
Queries=0 size=3.07422 max_size=3.07422 next=R3_2 ETA=infs
Queries=0 size=3.07422 max_size=3.07422 next=R3_2 ETA=infs
Segmentation fault (core dumped)
many lines of output such as the last two above, then Segfault

diamond blastx --unal 0 -c1 --log --threads 20 -q /scratch/ek/stingless.bee.genomics/annotation/funannotate/TetragonulaCarbonaria/2020_10_16_TetragonulaCarbonaria/predict_misc/genome.softmasked.fa --db diamond -o diamond.matches.tab -e 1e-10 -k 0 --more-sensitive -f 6 sseqid slen sstart send qseqid qlen qstart qend pident length evalue score qcovhsp qframe

Queries=0 size=2.68359 max_size=2.68359 next=R3_2 ETA=infs
Queries=0 size=2.68359 max_size=2.68359 next=R3_2 ETA=infs
Segmentation fault (core dumped)
many lines of output such as the last two above, then Segfault

Any idea how to fix this, or whether the newer, very contiguous genome sequences are a problem?
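
To check whether a query file contains very long contigs before running blastx, the sequence lengths can be listed first. This is only a minimal sketch, assuming a plain (uncompressed) FASTA file; `fasta_lengths` and `long_contigs` are hypothetical helper names, not part of diamond or funannotate, and the length cutoff is an arbitrary choice.

```python
from typing import Dict, Iterable, List, Tuple


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for an iterable of FASTA lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def long_contigs(lengths: Dict[str, int], min_len: int) -> List[Tuple[str, int]]:
    """Return (name, length) pairs with length >= min_len, longest first."""
    hits = [(n, l) for n, l in lengths.items() if l >= min_len]
    return sorted(hits, key=lambda item: -item[1])
```

For a real file, something like `with open("genome.softmasked.fa") as fh: print(long_contigs(fasta_lengths(fh), 5_000_000))` would list the contigs of 5 Mb or more; the 5 Mb cutoff is just an example value.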

bbuchfink · 2020-10-18T11:51:27Z

I was not able to reproduce a segfault in blastx in a quick test. Could you maybe make your query file available to me and let me know the database you're using, so I can look into this?

For very long queries, it would also be worth a try to use frameshift alignment mode which should work better in these cases (even if you don't expect frameshifts).

estolle · 2020-10-18T13:05:46Z

Hi

How can I activate the frameshift alignment mode? It's not clear to me which would be the correct option (diamond help).

I ran a few more tests. It seems the segfault occurs with the first contig. It's 24 Mb long. If I split it at stretches of N, I get a 7 Mb, a 15 Mb and a 2 Mb piece. The latter works; the two large ones do not. If I split the first 7 Mb piece further into 3 pieces, all three pieces work.
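
Splitting a contig at stretches of N, as described above, could be sketched roughly like this. The function name `split_at_n_runs` and the `min_run` threshold are assumptions made for illustration (the thread does not say how long the N stretches were); the start offsets are kept so that hits on a piece can be mapped back to the original contig coordinates.

```python
import re
from typing import List, Tuple


def split_at_n_runs(seq: str, min_run: int = 10) -> List[Tuple[int, str]]:
    """Split seq at runs of at least min_run N/n characters.

    Returns (0-based start offset in seq, piece) tuples. Shorter N runs
    stay inside the pieces; soft-masked (lowercase) bases are left as-is.
    """
    pieces: List[Tuple[int, str]] = []
    start = 0
    for match in re.finditer(r"[Nn]{%d,}" % min_run, seq):
        if match.start() > start:
            pieces.append((start, seq[start:match.start()]))
        # Continue after the N run, dropping the run itself.
        start = match.end()
    if start < len(seq):
        pieces.append((start, seq[start:]))
    return pieces
```

Each piece can then be written out as its own FASTA record (for example with the offset appended to the contig name) and used as a separate query.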

I'll send you a copy of the fasta (first contig) and the DB this afternoon.

Thanks a lot for your help!

Best
Eckart

bbuchfink · 2020-10-18T13:07:49Z

For the frameshift mode, use -F with the penalty, for example -F 15.
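
For anyone scripting this, a small sketch of how the blastx call from this thread could be assembled with the frameshift penalty added. Only options that already appear in the commands above are used; the helper name `blastx_command` and its defaults are hypothetical, and the function only builds the argument list rather than running diamond.

```python
from typing import List, Optional

# Output columns copied from the commands quoted in this thread.
OUTFMT_FIELDS = (
    "sseqid slen sstart send qseqid qlen qstart qend "
    "pident length evalue score qcovhsp qframe"
).split()


def blastx_command(query: str, db: str, out: str, threads: int = 8,
                   frameshift: Optional[int] = 15) -> List[str]:
    """Build the diamond blastx argument list; frameshift=None omits -F."""
    cmd = [
        "diamond", "blastx",
        "--threads", str(threads),
        "-q", query, "--db", db, "-o", out,
        "-e", "1e-10", "-k", "0", "--more-sensitive",
    ]
    if frameshift is not None:
        # Frameshift alignment mode with the given penalty, e.g. -F 15.
        cmd += ["-F", str(frameshift)]
    # Tabular output (format 6) with the column list as trailing values.
    cmd += ["-f", "6", *OUTFMT_FIELDS]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where diamond is installed.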

estolle · 2020-10-18T13:52:20Z

I emailed you a small test dataset.

Your suggestion of using -F 15 appears to work for this small dataset! Nice! I am running the full contig/database now to see.

Could this error suggest that there are lots of frameshifts present in the fasta?
If I use this setting (-F 15) as a precautionary measure all the time, would this affect the quality of the results in "normal" cases?

estolle · 2020-10-18T15:50:56Z

So the first 7 Mb of contig 1 against a tiny test DB of proteins seems to have worked with the -F 15 option.

I tried running a larger FASTA file against the full DB, and it ran out of RAM (I had 750 GB of RAM) and got killed. The same happens if I reduce the DB (to the tiny test DB of proteins). Running only contig 1 (24 Mb) against the test DB is OK with the RAM (it spikes every now and then to up to 170 GB), but then throws an error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

bbuchfink · 2020-10-18T16:27:14Z

There was a problem with memory usage in the frameshift mode (see other issue). Please try again using the latest commit.

estolle · 2020-10-18T18:37:13Z

Is there a binary for the latest commit?

I cannot compile the GitHub clone, neither on my system nor within the conda env:

[ 81%] Building CXX object CMakeFiles/diamond.dir/src/basic/value.cpp.o
/home/ek/progz/diamond/src/tools/roc.cpp: In constructor ‘FamilyMapping::FamilyMapping(const string&)’:
/home/ek/progz/diamond/src/tools/roc.cpp:81:30: error: converting to ‘std::tuple<char, int>’ from initializer list would use explicit constructor ‘constexpr std::tuple<_T1, _T2>::tuple(_U1&&, _U2&&) [with _U1 = char&; _U2 = int&; = void; _T1 = char; _T2 = int]’
fam2fold[i.first->second] = { domain_class[0], fold };
^
[ 82%] Building CXX object CMakeFiles/diamond.dir/src/tools/merge_tsv.cpp.o
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-unknown-warning-option’
cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’
cc1plus: warning: unrecognized command line option ‘-Wno-implicit-fallthrough’
CMakeFiles/diamond.dir/build.make:2030: recipe for target 'CMakeFiles/diamond.dir/src/tools/roc.cpp.o' failed
make[2]: *** [CMakeFiles/diamond.dir/src/tools/roc.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
CMakeFiles/Makefile2:180: recipe for target 'CMakeFiles/diamond.dir/all' failed
make[1]: *** [CMakeFiles/diamond.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

estolle · 2020-10-18T19:43:19Z

I tried a few options with CMake, but without success:

[ 2%] Building CXX object CMakeFiles/arch_avx2.dir/src/dp/swipe/banded_3frame_swipe.cpp.o
In file included from /home/ek/progz/diamond/src/dp/swipe/swipe.cpp:28:0:
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_GENERIC::AsyncTargetBuffer<_t>::max_len() const [with _t = signed char]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_GENERIC::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_GENERIC::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:420:192: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < target_it.count; ++i)
^
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_GENERIC::AsyncTargetBuffer<_t>::max_len() const [with _t = short int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_GENERIC::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_GENERIC::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:424:193: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_GENERIC::AsyncTargetBuffer<_t>::max_len() const [with _t = int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_GENERIC::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = int; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:427:179: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
cc1plus: warning: unrecognized command line option ‘-Wno-unknown-warning-option’
cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’
cc1plus: warning: unrecognized command line option ‘-Wno-implicit-fallthrough’
[ 3%] Building CXX object CMakeFiles/arch_sse4_1.dir/src/dp/swipe/swipe.cpp.o
[ 4%] Building CXX object CMakeFiles/arch_generic.dir/src/dp/swipe/banded_swipe.cpp.o
[ 5%] Building CXX object CMakeFiles/arch_avx2.dir/src/dp/swipe/swipe.cpp.o
[ 5%] Building CXX object CMakeFiles/arch_sse4_1.dir/src/dp/swipe/banded_swipe.cpp.o
In file included from /home/ek/progz/diamond/src/dp/swipe/swipe.cpp:28:0:
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_SSE4_1::AsyncTargetBuffer<_t>::max_len() const [with _t = signed char]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_SSE4_1::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_SSE4_1::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:420:192: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < target_it.count; ++i)
^
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_SSE4_1::AsyncTargetBuffer<_t>::max_len() const [with _t = short int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_SSE4_1::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_SSE4_1::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:424:193: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_SSE4_1::AsyncTargetBuffer<_t>::max_len() const [with _t = int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_SSE4_1::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = int; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:427:179: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from /home/ek/progz/diamond/src/dp/swipe/swipe.cpp:28:0:
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_AVX2::AsyncTargetBuffer<_t>::max_len() const [with _t = signed char]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_AVX2::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_AVX2::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:420:192: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < target_it.count; ++i)
^
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_AVX2::AsyncTargetBuffer<_t>::max_len() const [with _t = short int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_AVX2::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = ARCH_AVX2::score_vector; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:424:193: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h: In instantiation of ‘int ARCH_AVX2::AsyncTargetBuffer<_t>::max_len() const [with _t = int]’:
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:356:35: required from ‘std::__cxx11::list DP::Swipe::ARCH_AVX2::swipe(const sequence&, Frame, DynamicIterator&, _cbs, int, std::vector&, Statistics&) [with _sv = int; _traceback = DP::VectorTraceback; _cbs = const signed char*]’
/home/ek/progz/diamond/src/dp/swipe/swipe.cpp:427:179: required from here
/home/ek/progz/diamond/src/dp/swipe/target_iterator.h:233:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
cc1plus: warning: unrecognized command line option ‘-Wno-unknown-warning-option’
cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’
cc1plus: warning: unrecognized command line option ‘-Wno-implicit-fallthrough’
[ 6%] Building CXX object CMakeFiles/arch_generic.dir/src/search/collision.cpp.o
cc1plus: warning: unrecognized command line option ‘-Wno-unknown-warning-option’
cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’
cc1plus: warning: unrecognized command line option ‘-Wno-implicit-fallthrough’
[ 7%] Building CXX object CMakeFiles/arch_avx2.dir/src/dp/swipe/banded_swipe.cpp.o

...

[ 73%] Building CXX object CMakeFiles/diamond.dir/src/align/gapped.cpp.o
[ 74%] Building CXX object CMakeFiles/diamond.dir/src/align/culling.cpp.o
[ 75%] Building CXX object CMakeFiles/diamond.dir/src/cluster/medoid.cpp.o
[ 75%] Building CXX object CMakeFiles/diamond.dir/src/cluster/cluster_registry.cpp.o
[ 76%] Building CXX object CMakeFiles/diamond.dir/src/cluster/multi_step_cluster.cpp.o
[ 77%] Building CXX object CMakeFiles/diamond.dir/src/cluster/mcl.cpp.o
[ 77%] Building CXX object CMakeFiles/diamond.dir/src/align/output.cpp.o
[ 78%] Building CXX object CMakeFiles/diamond.dir/src/tools/roc.cpp.o
/home/ek/progz/diamond/src/tools/roc.cpp: In constructor ‘FamilyMapping::FamilyMapping(const string&)’:
/home/ek/progz/diamond/src/tools/roc.cpp:81:30: error: converting to ‘std::tuple<char, int>’ from initializer list would use explicit constructor ‘constexpr std::tuple<_T1, _T2>::tuple(_U1&&, _U2&&) [with _U1 = char&; _U2 = int&; = void; _T1 = char; _T2 = int]’
fam2fold[i.first->second] = { domain_class[0], fold };
^
[ 79%] Building CXX object CMakeFiles/diamond.dir/src/test/data.cpp.o
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-unknown-warning-option’
cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’
cc1plus: warning: unrecognized command line option ‘-Wno-implicit-fallthrough’
CMakeFiles/diamond.dir/build.make:2030: recipe for target 'CMakeFiles/diamond.dir/src/tools/roc.cpp.o' failed
make[2]: *** [CMakeFiles/diamond.dir/src/tools/roc.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
CMakeFiles/Makefile2:180: recipe for target 'CMakeFiles/diamond.dir/all' failed
make[1]: *** [CMakeFiles/diamond.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

The 2.0.4 release compiles without problems:
wget http://github.com/bbuchfink/diamond/archive/v2.0.4.tar.gz
tar xzf v2.0.4.tar.gz
cd diamond-2.0.4
mkdir bin
cd bin
cmake ..
make -j4

bbuchfink · 2020-10-19T09:03:31Z

I have tried to fix the compiler error here: 7c526e0

estolle · 2020-10-19T11:03:43Z

Awesome! It compiled now. Thanks so much for fixing this so super quick!

I'll shortly test how it works with my dataset.

estolle · 2020-10-19T12:59:38Z

It works in under 1 minute and with a tiny RAM footprint if I use the -F 15 option (without it, there is still a segfault).

Thanks for fixing it so rapidly!

bbuchfink · 2020-10-19T20:04:04Z

Glad it is working now. I'll look into the segfault too but it should be fine using the frameshift mode.

bbuchfink · 2021-02-12T13:19:34Z

Sorry this took longer, but the segfault should be fixed in the latest release.

bbuchfink closed this as completed Feb 12, 2021

estolle mentioned this issue Feb 12, 2021

funannotate predict fails suddenly at diamond/exonerate step nextgenusfs/funannotate#503

Closed
