Error - std::bad_alloc during "calling consensus sequence between anchors" #5

Open
jpummil opened this issue Feb 6, 2020 · 8 comments
@jpummil

jpummil commented Feb 6, 2020

[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Compute resources seem ample from a memory perspective: the node has 768 GB of RAM. Monitoring with top, the task was using around 330 GB when the error occurred.

Input data:
(Haslr) pinnacle-l4:jpummil:/scrfs/storage/jpummil/C.vittatus$ ls -lh *.fastq
-rw-r--r--. 1 jpummil jpummil 13G Feb 6 11:07 NU2WGS_R1.fastq
-rw-r--r--. 1 jpummil jpummil 13G Feb 6 11:09 NU2WGS_R2.fastq
-rw-r--r--. 1 jpummil jpummil 31G Feb 6 11:16 Q1133andQ1171.fastq

Command used:
haslr.py -t 24 -g 700m -o RUN1 -l Q1133andQ1171.fastq -x pacbio -s NU2WGS_R1.fastq NU2WGS_R2.fastq
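
Since the crash happened at roughly 330 GB on a 768 GB node, one thing that can be ruled out quickly is a per-process cap imposed by the shell or the batch scheduler. The following is a minimal sketch using Python's standard `resource` module (Unix only). The helper names `describe_limit` and `address_space_limits` are made up for illustration, and nothing in this thread confirms that such a limit is the cause.

```python
import resource


def describe_limit(value):
    """Render an rlimit value as text; RLIM_INFINITY means no cap is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def address_space_limits():
    """Return the (soft, hard) virtual address space limits as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = address_space_limits()
    print(f"RLIMIT_AS soft={soft} hard={hard}")
```

Resource limits are inherited by child processes, so this would need to run in the same shell or job script that launches haslr.py for the numbers to be comparable.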

jpummil closed this as completed Feb 7, 2020
@haghshenas
Collaborator

Hi Jeff,

Sorry that I was busy with other stuff and got back to this so late.
Is the issue fixed? If yes, could you please let me know what the problem was and how you fixed it?

@jstrickland63

Hello, I also got this same error. I ran it on a node with 80 CPUs and 1500 GB of RAM. I was able to run the sample dataset with no problems. The contents of the .err file are below. Please let me know what other information you might need. I will also start tinkering with parameters as jelber2 did in #11 and see if I get lucky.

[NOTE] number of threads: 80

[NOTE] loading contig sequences...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/sr_k49_a3.contigs.nooverlap.fa... Done in 7.63 CPU seconds (7.63 real seconds)
loaded 16708266 contigs
elapsed time 8.25 CPU seconds (10.37 real seconds)

[NOTE] calculating kmer frequency of unique contigs
mean: 67.42
elapsed time 9.85 CPU seconds (11.96 real seconds)

[NOTE] loading long read sequences...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/lr25x.fasta... Done in 70.21 CPU seconds (70.23 real seconds)
loaded 4213353 long reads
elapsed time 80.08 CPU seconds (82.21 real seconds)

[NOTE] loading alignment between contigs and long reads...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/map_contigs_k49_a3_lr25x.paf... Done in 103.05 CPU seconds (103.11 real seconds)
loaded 9169927 alignments
elapsed time 192.75 CPU seconds (218.77 real seconds)

[NOTE] fixing overlapping alignments...
elapsed time 225.54 CPU seconds (251.56 real seconds)

[NOTE] building compact long reads...
elapsed time 229.36 CPU seconds (255.92 real seconds)

[NOTE] building the backbone graph...
elapsed time 239.56 CPU seconds (268.25 real seconds)

[NOTE] cleaning weak edges...
removed 192666 edges
elapsed time 247.01 CPU seconds (277.75 real seconds)

[NOTE] cleaning tips...
removed 1324 tips
elapsed time 254.49 CPU seconds (287.72 real seconds)

[NOTE] cleaning simple bubbles...
removed 11407 simple bubbles
elapsed time 261.54 CPU seconds (296.75 real seconds)

[NOTE] cleaning super bubbles...
removed 265 super bubbles
elapsed time 268.81 CPU seconds (306.35 real seconds)

[NOTE] cleaning small bubbles...
removed 24 small bubbles
elapsed time 275.91 CPU seconds (316.15 real seconds)

[NOTE] calculating long read coordinates between anchors...
elapsed time 6082.05 CPU seconds (394.16 real seconds)

[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
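
The .err log above follows a regular pattern: each stage is announced with a `[NOTE]` line, and the crash shows up as a `std::bad_alloc` message. A small sketch like the following could pull the last stage reached out of such a file when comparing failed runs. `last_stage` is a hypothetical helper, not part of HASLR, and it only assumes the line format visible in this thread.

```python
import re

# Matches lines such as "[NOTE] cleaning tips..." and captures the stage text.
NOTE_RE = re.compile(r"^\[NOTE\]\s*(.+?)\s*$")


def last_stage(log_text):
    """Return (stage, crashed): the last [NOTE] stage seen and whether
    a std::bad_alloc message appears anywhere in the log."""
    stage = None
    crashed = False
    for line in log_text.splitlines():
        match = NOTE_RE.match(line.strip())
        if match:
            # Drop the trailing "..." that the stage lines end with.
            stage = match.group(1).rstrip(".").strip()
        elif "std::bad_alloc" in line:
            crashed = True
    return stage, crashed


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        print(last_stage(handle.read()))
```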

haghshenas reopened this Mar 10, 2020
@haghshenas
Collaborator

I think I was able to reproduce this bug. I'm working on it and will update you.

@palfalvi

palfalvi commented May 8, 2020

Hi @haghshenas !

I also got the same error recently on 35 CPUs and 2800 GB RAM while trying to assemble a 2.3 Gbp genome with Illumina and ONT data. I would just like to know if there is any solution or workaround. It would be great to test this assembler!

@CIWa

CIWa commented Jun 29, 2020

Hi,

Thank you very much for offering this great tool! To add a data point for this issue:
I am using a single node with 2 TB of RAM for a 6 Gb genome with PacBio CLR + Illumina PE data. RAM usage was fine before the crash (< 60 %). My version is 0.8a1, built successfully from source.

My call:
haslr.py -t 126 -o ./ -g 6g -l pacbio.fa -x pacbio -s R1.fastq R2.fastq

The error messages are:
[29-Jun-2020 10:07:29] assembling long reads using HASLR... failed
ERROR: "haslr_assemble" returned non-zero exit status

And in asm_contigs_k49_a3_c250_lr25x_b500_s3_sim0.85.err:
[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Best wishes,
Isabel

@drs

drs commented Aug 16, 2020

Hi @haghshenas,

I also got the same error trying to assemble a small genome (32.5 Mb, Thalassiosira pseudonana) on two different systems (Ubuntu 20.04 with 64 GB RAM and 40 threads, and Debian 9 with 400 GB RAM and 40 threads).

I used different subsamples (15X to 100X coverage) of a large dataset (SRA SRR7762361) to assemble the genome, and the only dataset that fails with this error is the one with 15X coverage.

The command I used is :

$ haslr.py -o Assembly -g 12.2m -x nanopore -t 40 \
-s Illumina-R1.fastq Illumina-R2.fastq Illumina-Singles.fastq -l Nanopore-15X.fastq

I do not have any clue why this specific dataset failed, or whether this is really related to this issue, since the issue seems to affect large, gigabase-sized genomes. I hope this additional information is of some value to you.

Regards,
Samuel Drouin
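
For anyone comparing subsamples like the 15X to 100X sets described above, coverage is simply total read bases divided by genome size. The sketch below shows that arithmetic on a FASTQ file. `parse_size`, `fastq_total_bases`, and `coverage` are hypothetical helpers written for this example; `parse_size` only imitates the `700m` / `6g` / `12.2m` style of values passed to `-g` in this thread and is not HASLR's own parser. Note that the command above passes `-g 12.2m` while the text mentions 32.5 Mb, so the resulting figure depends on which value is used.

```python
def parse_size(text):
    """Convert strings such as '700m' or '6g' to a number of bases."""
    units = {"k": 1e3, "m": 1e6, "g": 1e9}
    text = text.strip().lower()
    if text and text[-1] in units:
        return float(text[:-1]) * units[text[-1]]
    return float(text)


def fastq_total_bases(lines):
    """Sum sequence lengths, assuming plain 4-line FASTQ records
    (the sequence is the second line of each record)."""
    return sum(len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1)


def coverage(total_bases, genome_size):
    """Average depth of coverage: total bases over genome size."""
    return total_bases / genome_size


if __name__ == "__main__":
    import sys

    fastq_path, genome = sys.argv[1], sys.argv[2]
    with open(fastq_path) as handle:
        bases = fastq_total_bases(handle)
    print(f"{coverage(bases, parse_size(genome)):.1f}X")
```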

@koujiaodahan

So, what's the solution to "terminate called after throwing an instance of 'std::bad_alloc'"?

@jessiepelosi

I've also run into this issue, with the same error during the "calling consensus sequence between anchors" step:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Has there been any progress on solving this issue since 2020?

Looking at this issue from STAR, it's likely not an issue with available memory but with the number of headers in the initial short-read assembly. Any advice or fixes?
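
To compare datasets on that basis, one could count the headers in the short-read contig FASTA that HASLR loads (the log earlier in this thread reports "loaded 16708266 contigs" for a file named sr_k49_a3.contigs.nooverlap.fa). This is only a sketch with hypothetical helper names, and whether the header count actually relates to the crash has not been confirmed anywhere in this thread.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def count_in_file(path):
    """Stream a FASTA file line by line so large files are not held in memory."""
    with open(path) as handle:
        return count_fasta_records(handle)


if __name__ == "__main__":
    import sys

    print(count_in_file(sys.argv[1]))
```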
