`bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb5 #49

yihchii · 2020-02-20T00:07:51Z

Setting:

Binary built using commit 091cf8a
Ran on AWS c5d.18xlarge with Linux 4.4.0-1099-aws and Ubuntu 16.04.6 LTS
Executing in AVX512 mode

Command:

⟫ bwa-mem2 mem -t 8 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242672_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242672_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242672_1\tPL:ILLUMINA\tPU:ERR3242672_1\tLB:ERR3242672\tSM:ER^C242672' > /dev/null

Error message:

...
[M::mem_pestat] (25, 50, 75) percentile: (827, 2626, 3427)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8627)
[M::mem_pestat] mean and std.dev: (2383.79, 1436.55)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 11227)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[0000] 10. Calling kt_for - worker_sam
*** Error in `bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb50 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f48411137e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f484111c37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f484112053c]
bwa-mem2[0x44cdf8]
bwa-mem2[0x445869]
bwa-mem2[0x42658c]
bwa-mem2[0x46a631]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f4841f286ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f48411a341d]
======= Memory map: ========
00400000-0049d000 r-xp 00000000 00:2a 75543                              /bwa-mem2/bwa-mem2.avx512bw
0069c000-006b0000 rw-p 0009c000 00:2a 75543                              /bwa-mem2/bwa-mem2.avx512bw
006b0000-006d1000 rw-p 00000000 00:00 0 
025dd000-03a99000 rw-p 00000000 00:00 0                                  [heap]
7f38f4000000-7f38f8000000 rw-p 00000000 00:00 0 
7f38f8000000-7f38fbffd000 rw-p 00000000 00:00 0 
7f38fbffd000-7f38fc000000 ---p 00000000 00:00 0 
7f390c000000-7f3910000000 rw-p 00000000 00:00 0
...

Below are the files we used:
Reference genome (hs38DH.fa as a tar file, built after 6743183): https://dl.dnanex.us/F/D/19XXQfV51qBYpP4jgY46YzpX7ZB6GQYv78Zg6B4x/hs38DH.bwa-index.avx512.v2.tar.gz
ERR3242672_1.fastq.gz (12.03 GiB): https://dl.dnanex.us/F/D/FXyYZk9kqk8Y0qVGPV1VkBpY0byfQyB1Q2zf7v84/ERR3242672_1.fastq.gz
ERR3242672_2.fastq.gz (13.77 GiB): https://dl.dnanex.us/F/D/8y5qbZbkbVYyJJZXqB1PPg1YVZ5JFK5J7QjYx82j/ERR3242672_2.fastq.gz

The text was updated successfully, but these errors were encountered:

yuk12 · 2020-04-20T07:29:39Z

I tried with latest commit on the given input. It executes to completion.
Could you try with the latest commit.

Run info:
Compiler used: icc-18, gcc.
Command line same as mentioned above.

make CXX=icpc multi
./bwa-mem2 mem -t 8 /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa /scratch/omics/genomes-indexes-reads/human/reads/ERR3242672_1.fastq.gz /scratch/omics/genomes-indexes-reads/human/reads/ERR3242672_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242672_1\tPL:ILLUMINA\tPU:ERR3242672_1\tLB:ERR3242672\tSM:ER^C242672' > /scratch/mvasimud/sam_out/tmpv.sa

Here is the log (icpc-18):
-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Entering FMI_search
* Reference seq len for bi-index = 6418915857
* Count:
0, 1
1, 1877984092
2, 3209457929
3, 4540931766
4, 6418915857

* Reading other elements of the index from files /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Index prefix: /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Read 0 ALT contigs
* Done reading Index!!
* Reading reference genome..
* Binary seq file = /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa.0123
* Reference genome size: 6418915856 bp
* Done reading reference genome !!

Memory pre-allocation for Chaining: 1393.3971 MB
Memory pre-allocation for BSW: 1916.9362 MB
Memory pre-allocation for BWT: 618.5134 MB

* Threads used (compute): 8
* No. of pipeline threads: 2

[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
[0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
[0000] Reallocating initial memory allocations!!
[0000] Calling mem_process_seqs.., task: 0
[0000] 1. Calling kt_for - worker_bwt
[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
[0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3209457928, n: 666668
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[0000][PE] skip orientation FF as there are not enough pairs
[0000][PE] skip orientation FR as there are not enough pairs
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000] 3. Calling kt_for - worker_sam
[0000][ M::mem_process_seqs] Processed 666668 reads in 82.258 CPU sec, 10.655 real sec

[[Skipped the intermediate logs]]

[0000] Calling mem_process_seqs.., task: 1140
[0000] 1. Calling kt_for - worker_bwt
[0000] read_chunk: 100000000, work_chunk_size: 42700500, nseq: 284670
[0000][ M::kt_pipeline] read 284670 sequences (42700500 bp)...
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3209457928, n: 666668
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[0000][PE] skip orientation FF as there are not enough pairs
[0000][PE] skip orientation FR as there are not enough pairs
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000] 3. Calling kt_for - worker_sam
[0000][ M::mem_process_seqs] Processed 666668 reads in 51.979 CPU sec, 6.602 real sec
[0000] Calling mem_process_seqs.., task: 1141
[0000] 1. Calling kt_for - worker_bwt
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3209457928, n: 284670
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[0000][PE] skip orientation FF as there are not enough pairs
[0000][PE] skip orientation FR as there are not enough pairs
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000] 3. Calling kt_for - worker_sam
[0000][ M::mem_process_seqs] Processed 284670 reads in 21.638 CPU sec, 2.706 real sec
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] Computation ends..
No. of OMP threads: 8
Processor is runnig @2693.609926 MHz

yihchii · 2020-06-11T00:22:21Z

Hi @yuk12, with the latest binary (20 days old by now) we have, this job failed with Segmentation fault when I used up all threads on the instance.

Ran on AWS c5d.18xlarge (72 cores) with Linux 4.4.0-1099-aws and Ubuntu 16.04.6 LTS
Executing in AVX512 mode

See log and error message below:

$ bwa-mem2 mem -t 72 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242672_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242672_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242672_1\tPL:ILLUMINA\tPU:ERR3242672_1\tLB:ERR3242672\tSM:ERR3242672'
...
-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: genome/hs38DH.fa
* Entering FMI_search
* Index file found. Loading index from genome/hs38DH.fa.bwt.8bit.32
* Reference seq len for bi-index = 6434693835
* Count:
0,	1
1,	1882204624
2,	3217346918
3,	4552489212
4,	6434693835
* Reading other elements of the index from files genome/hs38DH.fa
* Index prefix: genome/hs38DH.fa
* Read 3171 ALT contigs
* Done reading Index!!
* Reading reference genome..
* Binary seq file = genome/hs38DH.fa.0123
* Reference genome size: 6434693834 bp
* Done reading reference genome !!
...

[V] 97061139 59:98552699
[V] 97552442 01:00:59557999
[V] 98707864 01:01:36115999
[V] 99033930
[D]	md5	a35c658a7220c05b2c16255584710479
[V] checksum ok
[V] blocks merged in time 01:05:77533499
[V] run time 13:02:54596900 (782.546 s)	MemUsage(size=15814.6,rss=5763.01,peak=20143.8)
1550 Segmentation fault

When I lowered the number of threads o use to 36, the job finished sucessfully:

$ bwa-mem2 mem -t 36 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242672_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242672_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242672_1\tPL:ILLUMINA\tPU:ERR3242672_1\tLB:ERR3242672\tSM:ERR3242672' > test.sam
...
[0000][PE] (25, 50, 75) percentile: (361, 419, 490)
[0000][PE] low and high boundaries for computing mean and std.dev: (103, 748)
[0000][PE] mean and std.dev: (427.63, 96.85)
[0000][PE] low and high boundaries for proper pairs: (1, 877)
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000][PE] skip orientation FF
[0000] 3. Calling kt_for - worker_sam
        [0000][ M::mem_process_seqs] Processed 284670 reads in 51.176 CPU sec, 1.977 real sec
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] Computation ends..
No. of OMP threads: 36
Processor is runnig @3000.173222 MHz
Runtime profile:

        Time taken for main_mem function: 6739.18 sec

        IO times (sec) :
        Reading IO time (reads) avg: 1270.00, (1270.00, 1270.00)
        Writing IO time (SAM) avg: 1055.70, (1055.70, 1055.70)
        Reading IO time (Reference Genome) avg: 4.14, (4.14, 4.14)
        Index read time avg: 31.98, (31.98, 31.98)

        Overall time (sec) (Excluding Index reading time):
        PROCESS() (Total compute time + (read + SAM) IO time) : 6698.43
        MEM_PROCESS_SEQ() (Total compute time (Kernel + SAM)), avg: 6695.87, (6695.87, 6695.87)

         SAM Processing time (sec):
        --WORKER_SAM avg: 2357.22, (2357.22, 2357.22)

        Kernels' compute time (sec):
        Total kernel (smem+sal+bsw) time avg: 4209.77, (4209.77, 4209.77)
                SMEM compute avg: 1487.40, (1505.20, 1480.82)
                SAL compute avg: 552.52, (558.28, 539.82)
                BSW time, avg: 1625.21, (1638.39, 1612.70)

        Total re-allocs: -1939826418 out of total requests: 1906125222, Rate: -1.02

Important parameter settings: 
        BATCH_SIZE: 512
        MAX_SEQ_LEN_REF: 256
        MAX_SEQ_LEN_QER: 128
        MAX_SEQ_LEN8: 128
        SEEDS_PER_READ: 500
        SIMD_WIDTH8 X: 64
        SIMD_WIDTH16 X: 32
        AVG_SEEDS_PER_READ: 64

You may close this ticket as there are workaround with the Segmentation fault, by lowering the number of threads to use.

yuk12 self-assigned this Feb 20, 2020

yuk12 closed this as completed Jun 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb5 #49

`bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb5 #49

yihchii commented Feb 20, 2020 •

edited

yuk12 commented Apr 20, 2020

yihchii commented Jun 11, 2020

`bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb5 #49

`bwa-mem2': free(): invalid next size (normal): 0x00007f3a1d66fb5 #49

Comments

yihchii commented Feb 20, 2020 • edited

yuk12 commented Apr 20, 2020

yihchii commented Jun 11, 2020

yihchii commented Feb 20, 2020 •

edited