
Segmentation fault when running in docker container #104

Closed
darcyabjones opened this issue Oct 26, 2018 · 7 comments

@darcyabjones

Hi there,

I've been trying to run hhblits in a container against the precompiled uniclust30 database.
I find that it often segfaults in the first iteration some time after the first "Alternative alignment" log shows up.
The same issue occurs with hhsearch.

I've used your docker container (https://hub.docker.com/r/soedinglab/hh-suite/) and also my own, which is heavily based on your alpine container (https://github.com/darcyabjones/pclust/blob/master/Dockerfiles/hhblits.Dockerfile), using hhblits v3.0-beta.3.

The VM I'm running the containers on has Ubuntu 18.04, 16 vCPUs, and 48 GB of RAM.

The search completes when searching the precompiled scop90 database.
The search against uniclust30 will also complete if I compile & run on the VM directly.

Here's a small example:

ubuntu@darcy-pclust:~/pclust$ cat test.faa 
>Sn15.NS.00005 
MRFVLVVLLGLLLSVRSDVSAHHVDAAIPDSSQISNLIFPAHVARPGGENSTVISHKRRW
NGPPPAPAADDVWEKMKCKGRKFMAQMSYSDFDAGQMLPVPQNTAQSPWYLAHLYSWAYV
ISSVGEVYRSLGPGGYWGVSDFFRHISISDKCVEEGGKWIAAVITHYQQGTLVDGQRYTS
PNGEVKRASGAYFYMAVNPQGGIIVQNTLGPREAANKVYPGNYPDTELPALQKLSDMMWM
MWEYYVPAAQRTNLDFVMSLSISNPTSLSIIRRAFDSQGQVLTATPYKFDPNSDGGLALL
GSPNGARVAHFLIQRKPQVGLKTVIGIYGFESQAKSRAPCLMFKLGNLAAATPRPPVQRS
ELGPSSGAEQNMPVEETSVKRVLEQRNFVRTHIFRFDGNVTLPSEYM
ubuntu@darcy-pclust:~/pclust$ docker run --rm -it -v $(pwd):/data:rw soedinglab/hh-suite hhblits -i /data/test.faa -d /data/databases/hhuniref/uniclust30_2018_08 -o /data/test.hhr
- 06:44:25.547 INFO: Searching 15161831 column state sequences.
- 06:44:25.726 INFO: /data/test.faa is in A2M, A3M or FASTA format
- 06:44:25.728 INFO: Iteration 1
- 06:44:26.959 INFO: Prefiltering database
- 06:46:26.638 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 294315
- 06:46:31.326 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 294
- 06:46:31.326 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 294
- 06:46:31.326 INFO: Scoring 294 HMMs using HMM-HMM Viterbi alignment
- 06:46:31.498 INFO: Alternative alignment: 0

ubuntu@darcy-pclust:~/pclust$ docker run --rm -it -v $(pwd):/data:rw soedinglab/hh-suite hhsearch -i /data/test.faa -d /data/databases/hhuniref/uniclust30_2018_08 -o /data/test.hhr
- 06:49:07.435 INFO: /data/test.faa is in A2M, A3M or FASTA format
- 06:49:10.530 INFO: Searching 15161831 database HHMs without prefiltering
- 06:49:46.817 INFO: Iteration 1
- 06:49:48.316 WARNING: database contains sequences that exceeds maximum allowed size (maxres = 20001). Maxres can be increased with parameter -maxres.
- 06:49:48.432 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 15161831
- 06:49:48.433 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 15161831
- 06:49:48.433 INFO: Scoring 15161831 HMMs using HMM-HMM Viterbi alignment
- 06:49:49.462 INFO: Alternative alignment: 0

There is no output file.
Unfortunately the error doesn't propagate out of the docker container but this is the error message:

Segmentation fault (core dumped)

Any thoughts why this would be?

Cheers, Darcy

PS. Thanks so much for your work.
I've been evangelising mmseqs and hhblits to my colleagues.

@narsapuramvijaykumar

narsapuramvijaykumar commented Oct 29, 2018

Hello Team,

I'm facing the same issue as @darcyabjones when trying to run using the docker image.
Below is the stdout from the program.
Available resources: 4 CPUs; 16 GB memory; database (Pfam) size: 5 GB.
hhblits -cpu 2 -i data/query/query.a3m -d data/pfam -o data/outputs/query.hhr

- 10:31:57.668 INFO: Searching 17929 column state sequences.
- 10:31:57.749 INFO: data/query/query.a3m is in A2M, A3M or FASTA format
- 10:31:57.770 INFO: Iteration 1
- 10:31:58.125 INFO: Prefiltering database
- 10:31:58.527 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 784
- 10:31:58.536 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 272
- 10:31:58.536 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 272
- 10:31:58.536 INFO: Scoring 272 HMMs using HMM-HMM Viterbi alignment
- 10:31:58.640 INFO: Alternative alignment: 0
- 10:32:01.738 INFO: 272 alignments done
- 10:32:01.739 INFO: Alternative alignment: 1
- 10:32:01.771 INFO: 48 alignments done
- 10:32:01.771 INFO: Alternative alignment: 2
- 10:32:01.781 INFO: 3 alignments done
- 10:32:01.781 INFO: Alternative alignment: 3
- 10:32:01.790 INFO: 1 alignments done
- 10:32:01.846 INFO: Realigning 10 HMM-HMM alignments using Maximum Accuracy algorithm

Segmentation fault (core dumped)

And hhsearch as below

hhsearch -cpu 4 -i data/query/query.a3m -d data/pfam -o data/outputs/query.hhr

- 10:25:17.320 INFO: data/query/query.a3m is in A2M, A3M or FASTA format
- 10:25:17.343 INFO: Searching 17929 database HHMs without prefiltering
- 10:25:17.356 INFO: Iteration 1
- 10:25:17.543 WARNING: database contains sequences that exceeds maximum allowed size (maxres = 20001). Maxres can be increased with parameter -maxres.
- 10:25:17.583 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 17929
- 10:25:17.583 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 17929
- 10:25:17.583 INFO: Scoring 17929 HMMs using HMM-HMM Viterbi alignment
- 10:25:17.793 INFO: Alternative alignment: 0

Segmentation fault (core dumped)

Thanks in advance.

Regards,
Vijay N

@milot-mirdita

Can you check if this issue still happens in the new release?

If so, please check on the host machine with sysctl vm.overcommit_memory whether memory overcommitment is enabled, and if it isn't, set the value to 1 with sysctl vm.overcommit_memory=1.
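For reference, a quick way to inspect the overcommit policy on the host (a generic sketch, not specific to hh-suite; the change commands are shown as comments since they need root on the host, not inside the container):

```shell
# Read the current policy: 0 = heuristic overcommit (default),
# 1 = always overcommit, 2 = never overcommit.
cat /proc/sys/vm/overcommit_memory
# To enable overcommit until the next reboot, one would run:
#   sudo sysctl vm.overcommit_memory=1
# To persist it across reboots (file name is a common convention):
#   echo 'vm.overcommit_memory = 1' | sudo tee /etc/sysctl.d/99-overcommit.conf
```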

@sabyUWO

sabyUWO commented Feb 28, 2019

- 21:25:59.268 INFO: Iteration 1
- 21:25:59.506 INFO: Prefiltering database
- 21:27:31.160 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 171339
- 21:27:31.962 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 185
- 21:27:31.962 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 185
- 21:27:31.962 INFO: Scoring 185 HMMs using HMM-HMM Viterbi alignment
- 21:27:32.093 INFO: Alternative alignment: 0

Segmentation fault

I am still having this error

@sabyUWO
Copy link

sabyUWO commented Feb 28, 2019

I can't set sysctl vm.overcommit_memory=1 as I don't have root privileges. Is there any other way?

@darcyabjones

Sorry for my delayed response and thanks for getting back to me.

Running hhblits and hhsearch with the same commands I sent originally works fine now.
The overcommit_memory option was set to 0 (thanks for that tip!), but it worked with both 1 and 0.

The current docker image i used was from git commit a0ca99d62d57.

$ sudo docker run --rm -it -v $(pwd):/data:rw soedinglab/hh-suite /usr/bin/time -v hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -v 1
	Command being timed: "hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -v 1"
	User time (seconds): 504.82
	System time (seconds): 6.07
	Percent of CPU this job got: 188%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 4m 30.83s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 27683024
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 222
	Minor (reclaiming a frame) page faults: 891229
	Voluntary context switches: 3442
	Involuntary context switches: 10899
	Swaps: 0
	File system inputs: 299040
	File system outputs: 1400
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

I have found that specifying -cpu > 2 still gives me a segfault once it hits the alignment stage.

$ sudo docker run --rm -it -v $(pwd):/data:rw soedinglab/hh-suite /usr/bin/time -v hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -cpu 3
- 06:16:35.906 INFO: Searching 15161831 column state sequences.
- 06:16:36.030 INFO: /data/test.faa is in A2M, A3M or FASTA format
- 06:16:36.031 INFO: Iteration 1
- 06:16:36.664 INFO: Prefiltering database
- 06:17:52.568 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 294315
- 06:17:56.102 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 294
- 06:17:56.102 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 294
- 06:17:56.102 INFO: Scoring 294 HMMs using HMM-HMM Viterbi alignment
- 06:17:56.371 INFO: Alternative alignment: 0

Command terminated by signal 11
	Command being timed: "hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -cpu 3"
	User time (seconds): 240.64
	System time (seconds): 4.72
	Percent of CPU this job got: 264%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 1m 32.79s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 27460848
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 728488
	Voluntary context switches: 68
	Involuntary context switches: 803
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

The same thing happens with hhsearch; it just skips the prefiltering and segfaults.
Running directly on the host VM happily uses the -cpu option, and shows a much smaller maximum resident set size in the time output.

$ /usr/bin/time -v hhblits -i test.faa -d data/uniclust30_2018_08/uniclust30_2018_08 -o test.hhr -cpu 3 -v 1
	Command being timed: "hhblits -i test.faa -d data/uniclust30_2018_08/uniclust30_2018_08 -o test.hhr -cpu 3 -v 1"
	User time (seconds): 358.94
	System time (seconds): 4.88
	Percent of CPU this job got: 261%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 2:18.93
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 6976848
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 446
	Minor (reclaiming a frame) page faults: 806549
	Voluntary context switches: 828
	Involuntary context switches: 2481
	Swaps: 0
	File system inputs: 191472
	File system outputs: 1400
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

I had a hunch that the difference was due to musl or some other Alpine peculiarity, so I made a quick Ubuntu version, and it seems to work like the native build.

$ cat Dockerfile
FROM ubuntu:latest as builder

RUN apt-get update \
    && apt-get install -y gcc g++ cmake vim build-essential \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /opt/hh-suite
ADD . .

WORKDIR /opt/hh-suite/build
RUN cmake -DHAVE_SSE2=1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local/hh-suite .. \
    && make \
    && make install

FROM ubuntu:latest
RUN apt-get update \
    && apt-get install -y libstdc++6 libgomp1 time \
    && rm -rf /var/lib/apt/lists/*

COPY --from=builder /usr/local/hh-suite /usr/local/hh-suite

ENV HHLIB=/usr/local/hh-suite
ENV PATH="/usr/local/hh-suite/bin:/usr/local/hh-suite/scripts:${PATH}"

CMD ["hhblits"]


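The image above can be built like this (a sketch; it assumes the Dockerfile sits at the root of an hh-suite source checkout, since its "ADD . ." copies the whole build context into /opt/hh-suite):

```shell
# Build the Ubuntu-based image from inside the hh-suite checkout.
docker build -t myhhsuite -f Dockerfile .
```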

$ sudo docker run --rm -it -v $(pwd):/data:rw myhhsuite /usr/bin/time -v hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -cpu 3 -v 1
	Command being timed: "hhblits -i /data/test.faa -d /data/data/uniclust30_2018_08/uniclust30_2018_08 -o /data/test.hhr -cpu 3 -v 1"
	User time (seconds): 484.41
	System time (seconds): 5.93
	Percent of CPU this job got: 270%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 3:01.16
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 7040696
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 3374
	Minor (reclaiming a frame) page faults: 803010
	Voluntary context switches: 3797
	Involuntary context switches: 1431
	Swaps: 0
	File system inputs: 819640
	File system outputs: 1400
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

I understand that time isn't necessarily the best way to measure memory use.
But from this it seems like something about the Alpine compilation or runtime libraries causes hhblits/hhsearch to chew through memory or do strange things with allocations/frees.
I could definitely be wrong; I'm not a C or Docker expert.

Anyway, thanks for looking at this and sorry for this monstrously long comment.

Version 3 looks really nice so far :).

@milot-mirdita

Thanks for the thorough testing, I replaced the alpine base image with debian stable-slim. Would you mind trying it out again?

@darcyabjones

The new version seems to run perfectly.
It happily ran with 16 CPUs and used about 3 GB of RAM on average for uniclust30.

Thanks :)

Hopefully this fixes the issues faced by others in this thread and is helpful.
Closing it for now.

Thanks again, Darcy
