memory consumption of MSMC #11

Closed
hhu1 opened this issue May 15, 2015 · 2 comments

hhu1 commented May 15, 2015

I was running MSMC on 4 genomes (8 haplotypes) from the same human population. I am particularly interested in the demographic history of the last 30,000 years. After conversion to the MSMC input format, the data is around 300 MB.

The issue is that MSMC is consuming too much memory. On our 256 GB server, it rapidly used up all the memory and was terminated by the system daemon. If I instead run it on chr22 only (4.1 MB input file), it runs fine but takes 4 GB of RAM.

Is there a way to reduce the memory usage and still get decent resolution on the demographic history of the last 30,000 years? If not, would you recommend reducing the number of chromosomes or the number of haplotypes included in the MSMC run? That is, which option compromises the accuracy of the results least?

Thanks very much for your help,
Hao Hu

stschiff (Owner) commented

Try running it first on 4 haplotypes, using the -I flag to pick the haplotypes you want. With 8 haplotypes and two subpopulations it uses quite a bit of memory, but I have usually found 80 GB to be enough. How many cores are you running in parallel? I usually use 10. The more cores you use, the more memory you need.
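
To make the suggestion above concrete, here is a minimal Python sketch that assembles an MSMC command line restricted to four haplotypes via -I, with the thread count capped via -t. The helper name `build_msmc_command`, the file names, and the output prefix are hypothetical. The `-o` output-prefix option and the 0-based, comma-separated index format are assumptions that should be checked against the usage text of your MSMC build.

```python
from typing import List, Sequence


def build_msmc_command(
    input_files: Sequence[str],
    haplotype_indices: Sequence[int],
    threads: int,
    out_prefix: str,
    binary: str = "msmc",
) -> List[str]:
    """Assemble an msmc argument list (hypothetical helper, not part of MSMC)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not input_files:
        raise ValueError("at least one input file is required")
    # -t caps the number of threads; per the thread, more threads need more memory.
    cmd = [binary, "-t", str(threads)]
    if haplotype_indices:
        # -I picks the haplotypes; 0-based comma-separated indices are assumed here.
        cmd += ["-I", ",".join(str(i) for i in haplotype_indices)]
    # -o as the output prefix is an assumption; verify it against your build.
    cmd += ["-o", out_prefix]
    cmd += list(input_files)
    return cmd


if __name__ == "__main__":
    # Hypothetical file name; the command is printed rather than executed.
    print(" ".join(build_msmc_command(["chr22.msmc.txt"], [0, 1, 2, 3], 10, "chr22_4hap")))
```

Returning the arguments as a list rather than a single string lets the result be passed straight to `subprocess.run` without shell quoting concerns.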

hhu1 commented May 19, 2015

I had been using the default -t. With -t set to 10, MSMC uses 75 GB and has been running well so far. Thanks for the help!

Best,
Hao
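
Since memory use here depended strongly on the -t setting, it can help to measure the peak memory of a small trial run (for example chr22 only) before committing to all chromosomes. The sketch below is one way to do that on a Unix system with Python's standard `resource` module. The helper name `peak_child_rss_gb` is hypothetical, and the command in the usage example is a placeholder, not an MSMC invocation.

```python
import resource
import subprocess
import sys
from typing import Sequence


def peak_child_rss_gb(cmd: Sequence[str]) -> float:
    """Run cmd to completion and report peak resident memory of child processes in GB.

    RUSAGE_CHILDREN covers every child this Python process has waited for,
    so the value is the largest peak seen so far, not only that of cmd.
    """
    subprocess.run(list(cmd), check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1024 ** 3


if __name__ == "__main__":
    # Placeholder command; substitute the trial MSMC command to be measured.
    print(f"{peak_child_rss_gb([sys.executable, '-c', 'pass']):.3f} GB")
```

All threads of a multithreaded program share a single process, so one measurement covers the whole run at a given -t value.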

On May 19, 2015, at 3:27 AM, Stephan Schiffels wrote:

> Try to run it first on 4 haplotypes, using the -I flag to pick the haplotypes you want. With 8 haplotypes and two subpopulations it uses quite a bit of memory, but I have usually found 80 GB to be enough. How many cores are you running in parallel? I use 10 usually. With more cores you need more memory.
