Why is classify-consensus-vsearch so slow? #80
Comments
The plot thickens. It would appear that longish sequences make vsearch very slow. That may not be surprising, given that we are performing global alignments. I have attached the data files here and here. This command works ok, with an average sequence length of 230.
This command does not work ok, with an average sequence length of 1428. Note that I got sick of waiting and killed it when it was only 30% complete after more than 10 hours.
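As a rough sanity check on the global-alignment hunch, here is a minimal sketch of a textbook Needleman-Wunsch scorer that also counts the dynamic-programming cells it fills. This is not vsearch's actual implementation (vsearch uses its own heuristics and optimized alignment code); the function names `nw_score` and `dp_cost_ratio` and the scoring values are made up for illustration. The only point is that a full global alignment fills roughly query length times target length cells, so the cost grows quadratically with sequence length.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Textbook Needleman-Wunsch global alignment score (linear gap penalty).

    Returns (score, cells), where cells is the number of DP cells filled.
    Hypothetical helper for illustration only, not vsearch code.
    """
    # First DP row: aligning an empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    cells = 0
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first column: prefix of `a` against empty `b`
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
            cells += 1
        prev = curr
    return prev[-1], cells


def dp_cost_ratio(short_len, long_len):
    """Ratio of DP cells when both sequences grow from short_len to long_len."""
    return (long_len * long_len) / (short_len * short_len)


if __name__ == "__main__":
    print(nw_score("ACGT", "AGT"))
    # Average lengths from the two runs above: 230 vs 1428.
    print(round(dp_cost_ratio(230, 1428), 1))
```

Under that assumption, going from an average length of 230 to 1428 means each pairwise alignment costs somewhere between 38 and 39 times more, before counting any other overhead. That is only the quadratic scaling of full dynamic programming, so treat it as an order-of-magnitude guide rather than a prediction of vsearch's runtime.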
@BenKaehler can we close this issue? It seems that we resolved this: vsearch (unlike, say, BLAST+) experiences a dramatic runtime increase when very long sequences are used. However, vsearch works fine (with runtimes approximately equivalent to BLAST+) when amplicon sequences are used. The issue is with vsearch itself, not with how we're wrapping it.
@BenKaehler @nbokulich is this safe to close?
Yes, I think it is safe to close. It is not, in any case, a bug; vsearch just runs much slower on full-length sequences. I don't have write privileges so cannot close. Thanks @jairideout!
classify-consensus-vsearch was 50 times slower than classify-consensus-blast in the run time analysis for the paper. The users have noticed. We should double check that there isn't anything strange going on.