Bug: "Memory Error: MACS2.IO.FixWidthTrack.FWTrack.get_rlengths" #313

Closed
carl- opened this issue Sep 17, 2019 · 10 comments

carl- commented Sep 17, 2019

Hi Tao,
Recently I ran macs2 callpeak on a genome with ~50K contigs, but macs2 aborted with the following message. Is there any way to resolve this problem?

INFO @ Sun, 15 Sep 2019 22:32:43: #3 Call peaks...
INFO @ Sun, 15 Sep 2019 22:32:43: #3 Pre-compute pvalue-qvalue table...
Traceback (most recent call last):
File "/usr/local/bin/macs2", line 622, in
main()
File "/usr/local/bin/macs2", line 57, in main
run( args )
File "/usr/local/lib/python2.7/site-packages/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "MACS2/PeakDetect.pyx", line 121, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1856)
File "MACS2/PeakDetect.pyx", line 393, in MACS2.PeakDetect.PeakDetect.__call_peaks_wo_control (MACS2/PeakDetect.c:5182)
File "MACS2/IO/CallPeakUnit.pyx", line 873, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_peaks (MACS2/IO/CallPeakUnit.c:12626)
File "MACS2/IO/CallPeakUnit.pyx", line 898, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_peaks (MACS2/IO/CallPeakUnit.c:11962)
File "MACS2/IO/CallPeakUnit.pyx", line 668, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__cal_pvalue_qvalue_table (MACS2/IO/CallPeakUnit.c:9249)
File "MACS2/IO/CallPeakUnit.pyx", line 497, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__pileup_treat_ctrl_a_chromosome (MACS2/IO/CallPeakUnit.c:7252)
File "MACS2/IO/FixWidthTrack.pyx", line 949, in MACS2.IO.FixWidthTrack.FWTrack.pileup_a_chromosome (MACS2/IO/FixWidthTrack.c:15953)
File "MACS2/IO/FixWidthTrack.pyx", line 966, in MACS2.IO.FixWidthTrack.FWTrack.pileup_a_chromosome (MACS2/IO/FixWidthTrack.c:15375)
File "MACS2/IO/FixWidthTrack.pyx", line 218, in MACS2.IO.FixWidthTrack.FWTrack.get_rlengths (MACS2/IO/FixWidthTrack.c:4487)
MemoryError

@taoliu taoliu self-assigned this Sep 17, 2019
Contributor

taoliu commented Sep 17, 2019

@carl- Happy to help! Could you share the full runtime message? And which version of MACS2 did you use?

Author

carl- commented Sep 17, 2019

> @carl- Happy to help! Could you share the full runtime message? And which version of MACS2 did you use?

The MACS2 version used is 2.1.2.
The full runtime message is as follows:

INFO @ Sun, 15 Sep 2019 22:32:14:
Command line: callpeak -t 24-L-1.bed -f BED -n 24-L-1 -g 710114939 --nomodel --shift -75 --extsize 150 -B --SPMR --keep-dup=all
ARGUMENTS LIST:
name = 24-L-1
format = BED
ChIP-seq file = ['24-L-1.bed']
control file = None
effective genome size = 7.10e+08
band width = 300
model fold = [5, 50]
qvalue cutoff = 5.00e-02
The maximum gap between significant sites is assigned as the read length/tag size.
The minimum length of peaks is assigned as the predicted fragment length "d".
Larger dataset will be scaled towards smaller dataset.
Range for calculating regional lambda is: 10000 bps
Broad region calling is off
Paired-End mode is off
MACS will save fragment pileup signal per million reads
INFO @ Sun, 15 Sep 2019 22:32:14: #1 read tag files...
INFO @ Sun, 15 Sep 2019 22:32:14: #1 read treatment tags...
INFO @ Sun, 15 Sep 2019 22:32:17: 1000000
INFO @ Sun, 15 Sep 2019 22:32:19: 2000000
INFO @ Sun, 15 Sep 2019 22:32:22: 3000000
INFO @ Sun, 15 Sep 2019 22:32:28: 4000000
INFO @ Sun, 15 Sep 2019 22:32:34: 5000000
INFO @ Sun, 15 Sep 2019 22:32:43: #1 tag size is determined as 142 bps
INFO @ Sun, 15 Sep 2019 22:32:43: #1 tag size = 142.0
INFO @ Sun, 15 Sep 2019 22:32:43: #1 total tags in treatment: 5978042
INFO @ Sun, 15 Sep 2019 22:32:43: #1 finished!
INFO @ Sun, 15 Sep 2019 22:32:43: #2 Build Peak Model...
INFO @ Sun, 15 Sep 2019 22:32:43: #2 Skipped...
INFO @ Sun, 15 Sep 2019 22:32:43: #2 Use 150 as fragment length
INFO @ Sun, 15 Sep 2019 22:32:43: #2 Sequencing ends will be shifted towards 5' by 75 bp(s)
INFO @ Sun, 15 Sep 2019 22:32:43: #3 Call peaks...
INFO @ Sun, 15 Sep 2019 22:32:43: #3 Pre-compute pvalue-qvalue table...
Traceback (most recent call last):
File "/local_data1/work/zhuwei/local/miniconda2/bin/macs2", line 622, in
main()
File "/local_data1/work/zhuwei/local/miniconda2/bin/macs2", line 57, in main
run( args )
File "/local_data1/work/zhuwei/local/miniconda2/lib/python2.7/site-packages/MACS2/callpeak_cmd.py", line 264, in run
peakdetect.call_peaks()
File "MACS2/PeakDetect.pyx", line 121, in MACS2.PeakDetect.PeakDetect.call_peaks (MACS2/PeakDetect.c:1856)
File "MACS2/PeakDetect.pyx", line 393, in MACS2.PeakDetect.PeakDetect.__call_peaks_wo_control (MACS2/PeakDetect.c:5182)
File "MACS2/IO/CallPeakUnit.pyx", line 873, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_peaks (MACS2/IO/CallPeakUnit.c:12626)
File "MACS2/IO/CallPeakUnit.pyx", line 898, in MACS2.IO.CallPeakUnit.CallerFromAlignments.call_peaks (MACS2/IO/CallPeakUnit.c:11962)
File "MACS2/IO/CallPeakUnit.pyx", line 668, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__cal_pvalue_qvalue_table (MACS2/IO/CallPeakUnit.c:9249)
File "MACS2/IO/CallPeakUnit.pyx", line 497, in MACS2.IO.CallPeakUnit.CallerFromAlignments.__pileup_treat_ctrl_a_chromosome (MACS2/IO/CallPeakUnit.c:7252)
File "MACS2/IO/FixWidthTrack.pyx", line 949, in MACS2.IO.FixWidthTrack.FWTrack.pileup_a_chromosome (MACS2/IO/FixWidthTrack.c:15953)
File "MACS2/IO/FixWidthTrack.pyx", line 966, in MACS2.IO.FixWidthTrack.FWTrack.pileup_a_chromosome (MACS2/IO/FixWidthTrack.c:15375)
File "MACS2/IO/FixWidthTrack.pyx", line 218, in MACS2.IO.FixWidthTrack.FWTrack.get_rlengths (MACS2/IO/FixWidthTrack.c:4487)
MemoryError

Contributor

taoliu commented Sep 18, 2019

Thanks @carl- !

It seems that the MemoryError occurs when we create a Python dictionary for the 50k contigs using the following code:

self.rlengths = dict([(k, INT_MAX) for k in self.__locations.keys()])

Could you share the names of the 50k contigs with me so that I can check how much memory this Python dictionary takes?

@taoliu taoliu changed the title Memory Error: MACS2.IO.FixWidthTrack.FWTrack.get_rlengths Bug: "Memory Error: MACS2.IO.FixWidthTrack.FWTrack.get_rlengths" Sep 18, 2019
Author

carl- commented Sep 18, 2019

> Thanks @carl- !
> It seems that when we create a python dictionary for 50k contigs using the following code, MemoryError occurs.
> self.rlengths = dict([(k, INT_MAX) for k in self.__locations.keys()])
> Could you share with me the names of the 50k contigs so that I can try to see how much memory this python dictionary will take?

Hi Tao,
Thanks for your attention. The contig names are attached:
24-L-1.contig.gz

Contributor

taoliu commented Sep 18, 2019

@carl- Thanks! I tried to create a dictionary with the contig names, and it required only a small amount of memory. Could you share the BED file with me so that I can reproduce the error? Check my profile to see my email address.
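
For anyone who wants to repeat that check, here is a minimal sketch of how the dictionary's footprint could be approximated. The contig names are made up for illustration (the real ones are in the attachment above), `INT_MAX` is assumed to be 2**31 - 1, and `sys.getsizeof` only measures objects shallowly, so the number is a rough estimate rather than an exact measurement.

```python
import sys

INT_MAX = 2**31 - 1  # assumption: stands in for the C INT_MAX used in the MACS2 source


def approx_dict_bytes(names):
    """Approximate bytes used by {name: INT_MAX} for the given contig names."""
    rlengths = dict([(k, INT_MAX) for k in names])
    total = sys.getsizeof(rlengths)                   # the hash table itself
    total += sum(sys.getsizeof(k) for k in rlengths)  # the key strings
    total += sys.getsizeof(INT_MAX)                   # all entries share one int object
    return total


# Hypothetical contig names; the real names come from the user's assembly.
names = ["contig_%05d" % i for i in range(50000)]
size = approx_dict_bytes(names)
print("approx. %.1f MB for %d contigs" % (size / 1e6, len(names)))
```

With names of this length the estimate should land in the megabyte range, which fits the observation that the dictionary alone is unlikely to exhaust memory.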

Contributor

taoliu commented Sep 19, 2019

@carl- I totally forgot that I implemented this option in "callpeak" to solve this kind of issue :(

$ macs2 callpeak -h
...
Other options:
  --buffer-size BUFFER_SIZE
                        Buffer size for incrementally increasing internal
                        array size to store reads alignment information. In
                        most cases, you don't have to change this parameter.
                        However, if there are large number of
                        chromosomes/contigs/scaffolds in your alignment, it's
                        recommended to specify a smaller buffer size in order
                        to decrease memory usage (but it will take longer time
                        to read alignment files). Minimum memory requested for
                        reading an alignment file is about # of CHROMOSOME *
                        BUFFER_SIZE * 2 Bytes. DEFAULT: 100000

Setting this option to 1000 will significantly decrease the memory requirement, although it will slow down the whole process. Note that you should only tweak this when you have a large number of contigs.
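
As a rough worked example, the formula quoted in the help text (# of CHROMOSOME * BUFFER_SIZE * 2 Bytes) can be evaluated directly. This is only a sketch: the function name is hypothetical, the contig count of 50,000 is an assumption based on the "~50K contigs" in the report, and the result is the minimum stated by the help text, not a measured value.

```python
def min_read_buffer_bytes(n_chromosomes, buffer_size=100000, bytes_per_unit=2):
    """Lower bound from the help text: # of CHROMOSOME * BUFFER_SIZE * 2 Bytes."""
    return n_chromosomes * buffer_size * bytes_per_unit


n_contigs = 50000  # assumption: "~50K contigs" from the original report
for buf in (100000, 1000):
    est = min_read_buffer_bytes(n_contigs, buf)
    print("--buffer-size %6d -> about %.1f GB" % (buf, est / 1e9))
```

By this estimate, the default buffer size asks for about 10 GB for 50,000 contigs, while `--buffer-size 1000` brings it down to about 100 MB.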

Contributor

taoliu commented Sep 19, 2019

This option was implemented for 2.0.10 back in 2013... No wonder I totally forgot it. I will add this option to the other subcommands and have MACS2 report the memory issue to users together with a suggestion to tweak the buffer_size.

2013-09-15  Tao Liu  <vladimir.liu@gmail.com>
	MACS version 2.0.10 20130915 (tag:alpha)

	* callpeak Added a new option --buffer-size

	This option is to tweak a previously hidden parameter that
	controls the steps to increase array size for storing alignment
	information. While in some rare cases, the number of
	chromosomes/contigs/scaffolds is huge, the original default
	setting will cause a huge memory waste. In these cases, we
	recommend to decrease --buffer-size (e.g., 1000) to save memory,
	although the decrease will slow process to read alignment files.
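
To judge whether an alignment file falls into the "huge number of contigs" case, one can count the distinct names in the first BED column. Below is a minimal sketch, assuming a plain tab-separated BED file; `count_contigs` is a hypothetical helper and not part of MACS2.

```python
import io


def count_contigs(bed_lines):
    """Count distinct chromosome/contig names (column 1) in BED-formatted lines."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional comment/track/browser header lines.
        if not line or line.startswith(("#", "track ", "browser ")):
            continue
        names.add(line.split("\t", 1)[0])
    return len(names)


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = io.StringIO("ctg1\t0\t142\nctg1\t50\t192\nctg2\t10\t152\n")
print(count_contigs(example))
```

A count in the tens of thousands would suggest trying a smaller `--buffer-size`; with only a handful of chromosomes the default is usually fine.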

taoliu added a commit that referenced this issue Sep 19, 2019
…le, and refinepeak. Catch MemoryError and give suggestion to users. Update testing script to run 50k contigs file. Decrease filesize of the testing files of BAMPE and BEDPE.
@taoliu taoliu pinned this issue Sep 19, 2019
Author

carl- commented Sep 20, 2019

Nice to know about this --buffer-size option. I will run macs2 with it, and if the issue is still there, I will reopen this issue.
BTW, is there any plan for a new release?

Contributor

taoliu commented Sep 20, 2019 via email

Author

carl- commented Sep 20, 2019

This issue was resolved by adjusting the '--buffer-size' option.
Thanks, Tao

@carl- carl- closed this as completed Sep 20, 2019
@taoliu taoliu unpinned this issue Sep 20, 2019