
Precluster method is taking too much time #323

Open
snashraf opened this Issue Apr 5, 2017 · 5 comments


snashraf commented Apr 5, 2017 edited

Hello Friends,

I recently started using mothur for some 16S rRNA data. I am stuck because the pre.cluster(fasta=fileList.paired.trim.contigs.good.unique.good.filter.unique.fasta, count=fileList.paired.trim.contigs.good.good.count_table, processors=8) step is taking too much time. It has been running for 4-5 days and is still not finished, even with only 3-4 samples.
Can someone please help me with this? Is this normal for everyone else as well? How can I tackle this issue?

Thanks
Najeeb

Contributor

mothur-westcott commented Apr 6, 2017

What version are you running? Could this be a memory issue? The more processors you use the more resources are required. Also, by default diffs=1. We recommend 1 diff per 100 bp.
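
As a rough illustration of the "1 diff per 100 bp" guideline above, the sketch below derives a diffs value from a read length and builds the pre.cluster command string. The helper names, the round-down rule, and the minimum of 1 are assumptions made for this example and are not part of mothur itself; the file names are the ones from the original post.

```python
def suggested_diffs(length_bp: int) -> int:
    """Return a diffs value following the '1 diff per 100 bp' guideline.

    Assumption: round down, but never go below 1 (mothur's stated default).
    """
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return max(1, length_bp // 100)


def pre_cluster_command(fasta: str, count: str, length_bp: int, processors: int) -> str:
    """Build a pre.cluster command string (hypothetical helper, not a mothur API)."""
    diffs = suggested_diffs(length_bp)
    return (
        f"pre.cluster(fasta={fasta}, count={count}, "
        f"diffs={diffs}, processors={processors})"
    )


if __name__ == "__main__":
    print(
        pre_cluster_command(
            "fileList.paired.trim.contigs.good.unique.good.filter.unique.fasta",
            "fileList.paired.trim.contigs.good.good.count_table",
            length_bp=250,   # assumed read length, adjust to your data
            processors=4,
        )
    )
```

The printed line can be pasted into a mothur session or batch file; the read length of 250 is only a placeholder.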

snashraf commented Apr 6, 2017

Contributor

mothur-westcott commented Apr 6, 2017

pre.cluster(fasta=fileList.paired.trim.contigs.good.unique.good.filter.unique.fasta, count=fileList.paired.trim.contigs.good.good.count_table, diffs=2, processors=??)

You may not be able to use all available processors depending on the number of sequences. Perhaps try 2 or 4?

I was finally able to run it, but it is taking too much time. I have run this on only 4 samples so far, but I need to run it on a few hundred samples. I am not sure how to use mothur on that many samples.

Contributor

mothur-westcott commented Apr 18, 2017

Are you using our latest version? https://github.com/mothur/mothur/releases How many sequences in total? Did you have good overlap when screening?
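
To answer the "how many sequences" question, one option is to count the header lines in the FASTA file that is passed to pre.cluster. This is a minimal sketch, assuming a plain-text FASTA file in which each record starts with a `>` line; the function name is made up for this example. In a dereplicated (unique) file this gives the number of unique sequences, not the total number of reads.

```python
def count_fasta_records(path: str) -> int:
    """Count records in a FASTA file by counting lines that start with '>'."""
    with open(path, "r") as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    import sys

    # Usage: python count_fasta.py <file.fasta>
    if len(sys.argv) == 2:
        print(count_fasta_records(sys.argv[1]))
```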
