exceeding defined RAM limits #266

Closed
mr-c opened this issue Jan 20, 2014 · 5 comments

@mr-c
Contributor

mr-c commented Jan 20, 2014

Reported by Julia Oh (see http://lists.idyll.org/pipermail/khmer/2013-December/000219.html)

"Starting with a fairly large file (estimated ~872400000 reads, ~185GB Illumina data):

I’m running the following command on a large-memory machine. From what I understand, the first normalization step should consume 240 GB of RAM, and it does:

$python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 20 -k 20 -N 4 -x 60e9 --savehash round2.unaligned_ref.kh -R round2.unaligned_1.report round2.unaligned;

It seems to end after removing ~33% of the reads, leaving ~118 GB of sequence data [report columns: reads processed, reads kept, fraction kept]:

tail round2.unaligned_1.report
871500000 584890641 0.67113097074
871600000 584966095 0.671140540385
871700000 585039359 0.671147595503
871800000 585109434 0.671150991053
871900000 585174062 0.671148138548
872000000 585244067 0.671151452982
872100000 585314163 0.671154871001
872200000 585388191 0.671162796377
872300000 585459804 0.671167951393
872400000 585529439 0.671170837918

Then I do the filtering step, which seems to run OK and makes the file a lot smaller, to about 54 GB:
$python2.7 /home/ohjs/khmer/scripts/filter-abund.py round2.unaligned_ref.kh round2.unaligned.keep;

Then I have a second normalization step:

$python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 5 -k 20 -N 4 -x 16e9 round2.unaligned.keep.abundfilt;

I thought I would be maxing out at 64 GB of RAM for the hash tables (I’ve also tried -x 32e9), but I get the following RAM usage report

4986693.biobos elapsed time: 23358 seconds
4986693.biobos walltime: 06:28:36 hh:mm:ss
4986693.biobos memory limit: 249.00 GB
4986693.biobos memory used: 249.76 GB
4986693.biobos cpupercent used: 98.00 %

around read 299200000, and then my job gets killed for exceeding memory allocation."
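
The 240 GB and 64 GB expectations in the report above follow directly from khmer's old two-parameter interface: -N hash tables of -x entries each, at one byte per counting-table entry. A minimal sketch of that arithmetic (the function name here is ours, for illustration only):

# Back-of-the-envelope footprint for the -N/-x interface, assuming
# one byte per counting-table entry and decimal gigabytes.

def expected_table_memory_gb(n_tables, table_size):
    """Peak RAM for the hash tables alone: N tables of x entries."""
    return n_tables * table_size / 1e9

print(expected_table_memory_gb(4, 60e9))  # first pass:  240.0 GB, as observed
print(expected_table_memory_gb(4, 16e9))  # second pass:  64.0 GB expected,
                                          # yet the job reached 249.76 GB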

She did a git checkout master and re-ran.

"Results are in and the error reproduced:

I ran the following commands:
python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 20 -k 20 -N 4 -x 60e9 --savehash round2.unaligned_ref.kh -R round2.unaligned_1.report round2.unaligned;
python2.7 /home/ohjs/khmer/scripts/filter-abund.py round2.unaligned_ref.kh round2.unaligned.keep;
python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 5 -k 20 -N 4 -x 16e9 round2.unaligned.keep.abundfilt;

This last command yields:

... kept 116741181 of 151000000 or 77%
... in file round2.unaligned.keep.abundfilt
... kept 116816167 of 151100000 or 77%
... in file round2.unaligned.keep.abundf-------- running PBS epilogue script (5081978.biobos p78 ohjs) --------

Show some job stats:

5081978.biobos elapsed time: 9485 seconds
5081978.biobos walltime: 02:37:52 hh:mm:ss
5081978.biobos memory limit: 69.14 GB
5081978.biobos memory used: 69.16 GB
5081978.biobos cpupercent used: 98.00 %"
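
For anyone trying to reproduce this, a minimal guard (not part of khmer; the 70 GB budget below is illustrative) can cap the process's address space with Python's standard resource module, so an overrun raises MemoryError inside the interpreter instead of the batch scheduler killing the job after the fact:

# Illustrative wrapper, not khmer code: cap the address space before
# running the script so an overrun fails fast and visibly.
import resource

budget = int(70e9)  # hypothetical cap, a little above the expected 64 GB
resource.setrlimit(resource.RLIMIT_AS, (budget, budget))
# ...then run normalize-by-median under this limit.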

@ctb
Member

ctb commented Jan 26, 2014

At our next developer meeting (Wed 2/5?) let's brainstorm about ways this could be happening in our codebase.

@RamRS
Contributor

RamRS commented Jan 26, 2014

Is there any way I could join this meeting please?

Ram

On Sat, Jan 25, 2014 at 10:20 PM, C. Titus Brown <notifications@github.com> wrote:

At our next developer meeting (Wed 2/5?) let's brainstorm about ways this could be happening in our codebase.

Reply to this email directly or view it on GitHub: https://github.com//issues/266#issuecomment-33308018

@ctb
Member

ctb commented Jan 26, 2014

On Sat, Jan 25, 2014 at 08:11:14PM -0800, Ram RS wrote:

Is there any way I could join this meeting please?

Not easily :(. Right now we aren't well instrumented for this.

Chat with @mr-c and see if he has any suggestions.

--t

@mr-c mr-c mentioned this issue Mar 25, 2014
@mr-c mr-c modified the milestones: 1.1+ Release, 1.0 release Apr 2, 2014
@mr-c mr-c modified the milestones: 1.1.1+ Release, 1.1 + 2 Aug 1, 2014
@mr-c mr-c modified the milestones: 1.4+, 1.4 May 13, 2015
@mr-c
Contributor Author

mr-c commented Sep 4, 2015

@ctb Now that we've reworked the memory usage and switched to --max-memory-usage, do you think this is resolved? I don't have a reproducible test case.
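
For readers following along, the idea behind --max-memory-usage, roughly sketched (this is not khmer's actual implementation), is to take a single total budget and derive per-table sizes from it, so the -N × -x multiplication can no longer be misjudged:

# Conceptual sketch only, not khmer's code: split one memory budget
# into equal one-byte-per-entry table sizes.

def tables_for_budget(max_memory_bytes, n_tables=4):
    """Return n table sizes that together fit the budget."""
    return [max_memory_bytes // n_tables] * n_tables

print(tables_for_budget(int(64e9)))  # four tables of 16e9 entries each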

@ctb
Member

ctb commented Sep 4, 2015

On Fri, Sep 04, 2015 at 09:52:51AM -0700, Michael R. Crusoe wrote:

@ctb Now that we've reworked the memory usage and switched to --max-memory-usage do you think this is resolved? I don't have a reproducible test case.

It's a rare enough bug that I think we need to see it again in the wild
before we worry anyone about it. So +1 for closing.

@mr-c mr-c closed this as completed Sep 4, 2015