Error: 'std::bad_alloc' #6

Closed
SwapnilDoijad opened this issue Feb 16, 2018 · 3 comments

@SwapnilDoijad commented Feb 16, 2018

fastANI ran properly with 100 genomes. However, increasing to 1,000 genomes resulted in the following error.

Error details:
$ fastANI --ql 1000_genomes.list --rl 1000_genomes.list -o output.txt

Reference = [1.fasta, 2.fasta, ......... 1000.fasta]
Query = [1.fasta, 2.fasta, ......... 1000.fasta]
Kmer size = 16
Fragment length = 3000
ANI output file = /media/network/project_Lm_all/results/43_fastANI/output.txt

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped) fastANI --ql 1000_genomes.list --rl 1000_genomes.list -o output.txt


Hardware details:

Processor | 8x Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Memory | 65858 MB
Operating System | Ubuntu 16.04.3 LTS

@cjain7 (Member) commented Feb 16, 2018

std::bad_alloc implies FastANI could not allocate the memory it needed. For 1000 microbial genomes, I would expect the memory usage to be well below 66 GB. Please answer a few follow-up questions here:

  1. What is the total size of all genomes you have in 1000_genomes.list? I wonder if it is too big.

  2. Can you provide the memory usage of the above run? It can be obtained easily using the /usr/bin/time utility:

/usr/bin/time fastANI --ql 1000_genomes.list --rl 1000_genomes.list -o output.txt

Please make sure you are not running other memory-intensive tasks on your system while doing this.

  3. Another thing you may want to try is splitting the reference list (--rl) 1000_genomes.list into two lists of 500 genomes each to reduce the memory use, and running them one by one as:

fastANI --ql 1000_genomes.list --rl 500_genomes_first.list -o output_1.txt
fastANI --ql 1000_genomes.list --rl 500_genomes_second.list -o output_2.txt
cat output_1.txt output_2.txt > output.txt
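
If needed, the two 500-genome lists can be generated from the original list with split. A minimal sketch, assuming GNU coreutils split; the file names match the commands above:

split -l 500 -d 1000_genomes.list 500_genomes_part_   # writes 500_genomes_part_00 and _01
mv 500_genomes_part_00 500_genomes_first.list
mv 500_genomes_part_01 500_genomes_second.list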

@SwapnilDoijad (Author) commented Feb 19, 2018

  1. 1000_genomes.list contains a list of 1000 genomes, each ~3 Mb.

  2. After closing all other programs:

(A) For 1000 genomes:

$ /usr/bin/time fastANI --ql 1000_genomes.list --rl 1000_genomes.list -o output.txt
Reference = [1.fasta, 2.fasta, ......... 1000.fasta]
Query = [1.fasta, 2.fasta, ......... 1000.fasta]
Kmer size = 16
Fragment length = 3000
ANI output file = output.txt
INFO, skch::Sketch::build, minimizers picked from reference = 305245397
INFO, skch::Sketch::index, unique minimizers = 7172419
INFO, skch::Sketch::computeFreqHist, Frequency histogram of minimizers = (1, 3440253) ... (529726, 1)
INFO, skch::Sketch::computeFreqHist, With threshold 0.001%, ignore minimizers occurring >= 2858 times during lookup.
INFO, skch::main, Time spent sketching the reference : 286.582 sec
INFO, skch::main, Time spent mapping fragments in query #1 : 412.583 sec
INFO, skch::main, Time spent post mapping : 20.5822 sec
Command terminated by signal 11
648.89user 7.07system 37:54.26elapsed 28%CPU (0avgtext+0avgdata 7770108maxresident)k
0inputs+0outputs (0major+979813minor)pagefaults 0swaps

(B) For 100 genomes, each 3 Mb, the run completed successfully:
7700.57user 1.39system 2:08:32elapsed 99%CPU (0avgtext+0avgdata 1661348maxresident)k
0inputs+3360outputs (0major+438435minor)pagefaults 0swaps

  3. Creating a bash loop to split the run was the final solution; it worked very well (a sketch of such a loop is below).
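
For reference, a minimal sketch of such a loop, assuming GNU coreutils split and bash; the 250-genome chunk size and the chunk_ file names are illustrative, not taken from the run above:

# Split the reference list into 250-genome chunks: chunk_00, chunk_01, ...
split -l 250 -d 1000_genomes.list chunk_

# Run fastANI once per reference chunk, then concatenate the partial outputs.
for rl in chunk_*; do
    fastANI --ql 1000_genomes.list --rl "$rl" -o "output_${rl}.txt"
done
cat output_chunk_*.txt > output.txt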

@cjain7 (Member) commented Feb 20, 2018

Thanks for sharing the info. I tried creating a custom dataset of 1000 E. coli genomes at my end but could not reproduce the above issue. Let me know if the data you are using is public.
