ERROR: Caught unhandled exception: std::bad_alloc in both 2.3.2 and 2.3.3 #46

Closed
YiweiNiu opened this issue May 5, 2018 · 6 comments
Comments

@YiweiNiu
YiweiNiu commented May 5, 2018

Hi, I got this error message when using versions 2.3.2 and 2.3.3.

The genome is about 2G, and default parameters were used.

version 2.3.2

[2018-04-18 18:38:40] INFO: Running Flye 2.3.2-release
[2018-04-18 18:38:40] INFO: Assembling reads
[2018-04-18 18:38:40] INFO: Reading sequences
[2018-04-18 21:39:09] INFO: Generating solid k-mer index
[2018-04-18 21:39:35] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-18 22:28:15] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-19 00:03:13] INFO: Filling index table
[2018-04-19 02:32:35] ERROR: Caught unhandled exception: std::bad_alloc
[2018-04-19 02:32:35] ERROR: 	flye-assemble(_Z16exceptionHandlerv+0xd0) [0x42f4a0]
[2018-04-19 02:32:35] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e0e6) [0x2b08dbac10e6]
[2018-04-19 02:32:35] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e131) [0x2b08dbac1131]
[2018-04-19 02:32:35] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e349) [0x2b08dbac1349]
[2018-04-19 02:32:35] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e869) [0x2b08dbac1869]
[2018-04-19 02:32:35] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(_Znam+0x9) [0x2b08dbac18c9]
[2018-04-19 02:32:35] ERROR: 	flye-assemble(_ZN11VertexIndex10buildIndexEii+0x9c6) [0x44f3d6]
[2018-04-19 02:32:35] ERROR: 	flye-assemble(main+0xaf8) [0x434378]
[2018-04-19 02:32:35] ERROR: 	/lib64/libc.so.6(__libc_start_main+0xfd) [0x3fbbe1ed5d]
[2018-04-19 02:32:35] ERROR: 	flye-assemble() [0x41d275]
[2018-04-19 02:32:57] ERROR: Command '['flye-assemble', '-l', '/home/zhangll/Tasks/Gouqi/Third_assembl/Flye/flye.log', '-t', '16', '-v', '5000', '/home/zhangll/Tasks/Gouqi/data/Pacbio/all.fasta', '/home/zhangll/Tasks/Gouqi/Third_assembl/Flye/0-assembly/draft_assembly.fasta', '2202009600', '/home/niuyw/software/Flye/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1

version 2.3.3

[2018-04-28 05:04:03] INFO: Running Flye 2.3.3-g47cdd0b
[2018-04-28 05:04:03] INFO: Assembling reads
[2018-04-28 05:04:03] INFO: Running with k-mer size: 17
[2018-04-28 05:04:03] INFO: Reading sequences
[2018-04-28 05:59:54] ERROR: parse error in /parastor300/niuyw/Project/Goqi_genome_180207/data/Pacbio/all.fq.gz on line 37943506: Fastq fromat error
[2018-04-28 05:59:58] ERROR: Command '['flye-assemble', '-l', '/parastor300/niuyw/Project/Goqi_genome_180207/flye/run1/flye.log', '-t', '30', '/parastor300/niuyw/Project/Goqi_genome_180207/data/Pacbio/all.fq.gz', '/parastor300/niuyw/Project/Goqi_genome_180207/flye/run1/0-assembly/draft_assembly.fasta', '2147483648', '/home/niuyw/software/Flye-2.3.3/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1
Finish time is 2018/04/28--05:59
niuyw@admin:/parastor300/niuyw/Project/Goqi_genome_180207/flye/run2$ cat ../flye.g.run1.e55155 
Start time is 2018/04/28--15:49
[2018-04-28 15:49:49] INFO: Running Flye 2.3.3-g47cdd0b
[2018-04-28 15:49:49] INFO: Assembling reads
[2018-04-28 15:49:49] INFO: Running with k-mer size: 17
[2018-04-28 15:49:49] INFO: Reading sequences
[2018-04-28 18:05:07] INFO: Reads N50/90: 16659 / 5780
[2018-04-28 18:05:23] INFO: Selected minimum overlap 5000
[2018-04-28 18:05:35] INFO: Expected read coverage: 102
[2018-04-28 18:05:35] INFO: Generating solid k-mer index
[2018-04-28 18:08:19] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-28 18:33:35] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-01 21:49:37] INFO: Filling index table
[2018-05-04 12:43:47] ERROR: Caught unhandled exception: std::bad_alloc
[2018-05-04 12:43:47] ERROR: 	flye-assemble(_Z16exceptionHandlerv+0xd0) [0x431590]
[2018-05-04 12:43:47] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e0e6) [0x2b52528950e6]
[2018-05-04 12:43:47] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e131) [0x2b5252895131]
[2018-05-04 12:43:47] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e349) [0x2b5252895349]
[2018-05-04 12:43:47] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(+0x5e869) [0x2b5252895869]
[2018-05-04 12:43:47] ERROR: 	/home/software/gcc-4.9.3/lib64/libstdc++.so.6(_Znam+0x9) [0x2b52528958c9]
[2018-05-04 12:43:47] ERROR: 	flye-assemble(_ZN11VertexIndex10buildIndexEii+0x9c6) [0x452056]
[2018-05-04 12:43:47] ERROR: 	flye-assemble(main+0xbe5) [0x436595]
[2018-05-04 12:43:47] ERROR: 	/lib64/libc.so.6(__libc_start_main+0xfd) [0x3fbbe1ed5d]
[2018-05-04 12:43:47] ERROR: 	flye-assemble() [0x41dbc5]
[2018-05-04 12:46:17] ERROR: Command '['flye-assemble', '-l', '/parastor300/niuyw/Project/Goqi_genome_180207/flye/run1/flye.log', '-t', '40', '/parastor300/niuyw/Project/Goqi_genome_180207/data/Pacbio/all.fasta', '/parastor300/niuyw/Project/Goqi_genome_180207/flye/run1/0-assembly/draft_assembly.fasta', '2147483648', '/home/niuyw/software/Flye-2.3.3/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1

BTW, I also ran Flye 2.3.3 on the Canu-corrected reads, and it ran successfully. Here are the logs in case they are useful.

[2018-04-28 05:15:35] INFO: Running Flye 2.3.3-g47cdd0b
[2018-04-28 05:15:35] INFO: Assembling reads
[2018-04-28 05:15:36] INFO: Running with k-mer size: 17
[2018-04-28 05:15:36] INFO: Reading sequences
[2018-04-28 05:49:20] INFO: Reads N50/90: 22994 / 18323
[2018-04-28 05:49:22] INFO: Selected minimum overlap 5000
[2018-04-28 05:49:24] INFO: Expected read coverage: 34
[2018-04-28 05:49:24] INFO: Generating solid k-mer index
[2018-04-28 05:49:47] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-28 05:55:09] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-28 08:55:36] INFO: Filling index table
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-04-28 17:32:09] INFO: Extending reads
[2018-04-28 18:19:00] INFO: Overlap-based coverage: 20
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-03 00:13:25] INFO: Assembled 6725 draft contigs
[2018-05-03 00:13:57] INFO: Generating contig sequences
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-03 01:03:32] INFO: Running Minimap2
[2018-05-03 10:15:46] INFO: Computing consensus
[2018-05-03 11:18:10] INFO: Alignment error rate: 0.0299390805236
[2018-05-03 11:18:34] INFO: Performing repeat analysis
[2018-05-03 11:18:35] INFO: Reading sequences
[2018-05-03 11:50:14] INFO: Building repeat graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-03 18:02:10] INFO: Aligning reads to the graph
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-04 02:18:05] INFO: Aligned sequence: 137844577112 / 149133911062 (0.924301)
[2018-05-04 02:18:34] INFO: Mean edge coverage: 38
[2018-05-04 02:20:09] INFO: Resolving repeats
[2018-05-04 11:02:04] INFO: Generating contigs
[2018-05-04 12:05:35] INFO: Generated 17311 contigs
[2018-05-04 14:08:03] INFO: Polishing genome (1/1)
[2018-05-04 14:08:03] INFO: Running Minimap2
[2018-05-04 21:32:38] INFO: Separating alignment into bubbles
[2018-05-05 03:50:13] INFO: Alignment error rate: 0.0230640593152
[2018-05-05 03:50:14] INFO: Correcting bubbles
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
[2018-05-05 08:18:15] INFO: Assembly statistics:

	Total length:	1886554189
	Contigs:	13177
	Scaffolds:	13049
	Scaffolds N50:	315687
	Largest scf:	2690265
	Mean coverage:	34

[2018-05-05 08:18:15] INFO: Final assembly: /parastor300/niuyw/Project/Goqi_genome_180207/flye/run2/scaffolds.fasta

Do you know what could have caused it? Thanks in advance!

Best,
Yiwei Niu

@fenderglass
Owner

Hi,

bad_alloc usually means that the system ran out of memory. How much RAM does your machine have? It seems that you have raw reads at 100x coverage. You might try downsampling them to, say, 40x (taking the longest reads), which should reduce the memory requirements. You should also be able to rerun the repeat resolution step with the entire set of reads afterwards.

Note that the Canu-corrected reads have 30x coverage, which is why less memory was required. If you are running with error-corrected reads (which is also an option), make sure you are using the 'pacbio-corr' option, not 'pacbio-raw'.
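
The downsampling suggested above (keep the longest reads until a target coverage is reached) could be sketched in Python roughly as follows. This is not part of Flye: the function names `read_fasta`, `select_longest` and `downsample_fasta` are hypothetical, and the sketch assumes an uncompressed FASTA file that is read twice so that the sequences themselves never have to be held in memory.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_longest(lengths, genome_size, target_coverage):
    """Return indices of the longest reads whose summed length
    reaches genome_size * target_coverage (or all reads if it never does)."""
    budget = genome_size * target_coverage
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    keep, total = set(), 0
    for i in order:
        if total >= budget:
            break
        keep.add(i)
        total += lengths[i]
    return keep


def downsample_fasta(in_path, out_path, genome_size, target_coverage):
    """Write the selected reads to out_path and return the number of bases kept."""
    # Pass 1: collect read lengths only.
    with open(in_path) as fh:
        lengths = [len(seq) for _, seq in read_fasta(fh)]
    keep = select_longest(lengths, genome_size, target_coverage)
    # Pass 2: copy the selected reads, preserving the input order.
    with open(in_path) as fh, open(out_path, "w") as out:
        for i, (header, seq) in enumerate(read_fasta(fh)):
            if i in keep:
                out.write(f">{header}\n{seq}\n")
    return sum(lengths[i] for i in keep)
```

For the run in this thread, a call might look like `downsample_fasta("all.fasta", "all.40x.fasta", 2147483648, 40)`, where the genome size is the value passed to flye-assemble in the log above and the output file name is made up for the example.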

@YiweiNiu
Author

YiweiNiu commented May 8, 2018

Thank you for your reply! The compute node I submitted the job to has 2T of RAM. I don't know whether that is enough.

I'll try downsampling the raw data. A basic question: apart from the memory required, what is the difference between using all the raw data (say 100X) and using downsampled data (say the longest 50X)?

@fenderglass
Owner

You might have extra connectivity information in the full 100x read set (you can resolve more repeats, for example). But some studies (the Canu paper, for example) suggest that you don't really need more than 40x in general, although this of course also depends on genome complexity, ploidy, etc. Plus, extra coverage helps to get a good final consensus.

@YiweiNiu
Author

YiweiNiu commented May 9, 2018

I see. Thank you!

@njaupan
njaupan commented Oct 24, 2018

Hi, I have run into the same issue, "ERROR: Caught unhandled exception: std::bad_alloc",
when I ran 30X and 90X ONT raw reads on our cluster with 2 nodes and 300G RAM.
I subsampled the reads by Canu correction, and then Flye worked for both datasets. However, why didn't the 30X ONT raw reads work?

Here is the error message:
~/software/Flye/bin/flye -t 36 --nano-raw $fasta --genome-size 130m --out-dir
[2018-09-21 17:02:25] INFO: Running Flye 2.3.5-release
[2018-09-21 17:02:25] INFO: Assembling reads
[2018-09-21 17:02:25] INFO: Running with k-mer size: 17
[2018-09-21 17:02:25] INFO: Reading sequences
[2018-09-21 17:07:05] INFO: Reads N50/90: 21104 / 5385
[2018-09-21 17:07:05] INFO: Selected minimum overlap 5000
[2018-09-21 17:07:05] INFO: Expected read coverage: 26
[2018-09-21 17:07:05] INFO: Generating solid k-mer index
[2018-09-21 17:10:26] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2018-09-21 17:11:11] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% [2018-09-21 17:18:04] ERROR: Caught unhandled exception: std::bad_alloc
[2018-09-21 17:18:04] ERROR: flye-assemble(_Z16exceptionHandlerv+0xd0) [0x43f530]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0x5e926) [0x7fc45deec926]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0x5e953) [0x7fc45deec953]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0xb5275) [0x7fc45df43275]
[2018-09-21 17:18:04] ERROR: /lib64/libpthread.so.0(+0x7df5) [0x7fc45d761df5]
[2018-09-21 17:18:04] ERROR: /lib64/libc.so.6(clone+0x6d) [0x7fc45d48f1ad]
[2018-09-21 17:18:05] ERROR: Command '['flye-assemble', '-l', '/home/panpan/assembly_flye/col.NF.flye2/flye.log', '-t', '36', '/home/panpan/raw_porechoped_data/porechop_col_fastq/fasta/before_cont/porechop_col.NF.reads.fasta', '/home/panpan/assembly_flye/col.NF.flye2/0-assembly/draft_assembly.fasta', '130000000', '/home/panpan/software/Flye/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1

Thank you
panpan

@fenderglass
Owner

Hi,

This looks strange: for a genome of ~130m and 30x coverage, it should not use more than 50G.

Does the node that you are using to run Flye have 300G of RAM (or are you referring to the total memory of all nodes)? Could you send me the file.log file?

It would also be helpful if you could watch the memory consumption to make sure that it indeed ran out of memory. You can either watch top/htop manually or use this script - https://github.com/jhclark/memusg.
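
As an alternative to watching top/htop, peak memory can also be read back after the process exits. The sketch below is an assumption-laden illustration rather than anything shipped with Flye or memusg: `run_and_report_peak` is a hypothetical helper that uses Python's standard `resource` module (Unix only), and the value it returns is the largest resident set size among the waited-for child processes, not the sum over all of them.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_gib).

    peak_rss_gib is the maximum resident set size reported for
    terminated child processes of this Python process.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    peak_bytes = usage.ru_maxrss * scale
    return proc.returncode, peak_bytes / 1024**3
```

A call could look like `run_and_report_peak(["flye", "-t", "36", "--nano-raw", "reads.fasta", "--genome-size", "130m", "--out-dir", "out"])`, where `reads.fasta` and `out` are placeholder names. If the reported peak is close to the physical RAM of the node, that would be consistent with bad_alloc being a genuine out-of-memory condition; a much smaller peak would point to something else, such as a per-job memory limit.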
