FastANI still not fast enough? #10

Closed
biofuture opened this issue Apr 26, 2018 · 1 comment

Comments

biofuture commented Apr 26, 2018

Dear Sir

I tried FastANI to compute ANI for about 2000 genomes, and it is quite slow. I ran the program on a large node with 64 cores and 500 GB of memory, but the software only runs in a single thread.

I know that you already supply a script to split the genomes into smaller parts. But on a single node, the speed is limited by I/O if I run the parts in parallel from one hard disk.

How did you compute ANI among 80000 genomes? Can you give me some hints?

I tried to run it on our HPCF; however, the memory requirement of every single run exceeds 96 GB, which is the memory available on most of our nodes.

I can only submit a limited number of jobs (10) at a time, so I can only split the work into fewer than 100 jobs rather than 1000 jobs.

Thank you very much!

Xiaotao

Member

cjain7 commented Apr 30, 2018

Generating ANI for 2000 genomes should be pretty quick and should take less than 96 GB of memory. In my latest run with 8000 genomes, FastANI used about 60 GB of memory.

Could you double-check the scripts that split the reference DB and call FastANI? Also see #6.
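
For readers checking their own split scripts, here is a minimal sketch of one way to split a reference list and generate one FastANI command per chunk. It is not the script shipped with FastANI. The helper names, the `fastani_jobs` directory, and the file names are hypothetical; the only FastANI options used are `--ql` (query list), `--rl` (reference list), and `-o` (output file). The assumption is that each job receives the full query list but only a slice of the references, so each job has a smaller reference set to handle.

```python
from pathlib import Path


def split_into_chunks(items, n_chunks):
    """Split items into contiguous parts whose sizes differ by at most one."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    # Drop empty chunks, which occur when n_chunks > len(items).
    return [c for c in chunks if c]


def build_commands(genomes, n_chunks, workdir="fastani_jobs"):
    """Write one reference list per chunk; return one fastANI command per chunk.

    The directory layout and file names are illustrative assumptions.
    """
    out = Path(workdir)
    out.mkdir(parents=True, exist_ok=True)

    # Every job compares the full query list against its own reference slice.
    query_list = out / "queries.txt"
    query_list.write_text("\n".join(genomes) + "\n")

    commands = []
    for i, chunk in enumerate(split_into_chunks(genomes, n_chunks)):
        ref_list = out / f"refs_{i:03d}.txt"
        ref_list.write_text("\n".join(chunk) + "\n")
        result = out / f"ani_{i:03d}.txt"
        commands.append(f"fastANI --ql {query_list} --rl {ref_list} -o {result}")
    return commands
```

Calling `build_commands(paths, n_chunks=10)` with a list of genome FASTA paths returns ten command strings, which would fit a scheduler that allows only 10 jobs at a time. Whether ten chunks keep each job under a given memory limit depends on the genomes and has to be measured.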
