Providing context helps us come up with a solution and improve our documentation for the future.
I am using MMseqs2 to cluster protein sequences in an automated pipeline, and I ran into a problem: MMseqs2 fails to process FASTA files that consist of the same sequence repeated, like this:
>record1
MALYNISEKILTTLEKTSFTIERLQERYDLQEAIKKNIDIVAPGCLVISEEFSDWEDSRR
>record2
MALYNISEKILTTLEKTSFTIERLQERYDLQEAIKKNIDIVAPGCLVISEEFSDWEDSRR
...
Perhaps it would be worth adding handling for such FASTA files.
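Until such handling exists, one possible workaround on the pipeline side is to collapse exact duplicates before clustering and keep a record of which headers shared each sequence. The following is only a minimal sketch under that assumption: `read_fasta` and `collapse_identical` are hypothetical helper names, not part of MMseqs2, and the two records are the ones quoted above.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_identical(records):
    """Group headers by exact sequence, preserving first-seen order."""
    groups = {}
    for header, seq in records:
        groups.setdefault(seq, []).append(header)
    return groups


# The two records from this report, used as illustrative input.
example = io.StringIO(
    ">record1\n"
    "MALYNISEKILTTLEKTSFTIERLQERYDLQEAIKKNIDIVAPGCLVISEEFSDWEDSRR\n"
    ">record2\n"
    "MALYNISEKILTTLEKTSFTIERLQERYDLQEAIKKNIDIVAPGCLVISEEFSDWEDSRR\n"
)

groups = collapse_identical(read_fasta(example))

# Write one representative per unique sequence; the remaining headers
# can be re-attached to the representative's cluster afterwards.
for seq, headers in groups.items():
    print(f">{headers[0]}")
    print(seq)
```

If only one unique sequence remains after collapsing, the pipeline could skip the clustering step entirely and report all headers as a single cluster.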
Your Environment
Include as many relevant details about the environment you experienced the bug in.
Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters):
Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.):
For self-compiled and Homebrew: Compiler and Cmake versions used and their invocation:
Server specifications (especially CPU support for AVX2/SSE and amount of system memory):
Operating system and version:
Expected Behavior
Clustering completes and produces a cluster.
Current Behavior
Clustering fails with an error.
Steps to Reproduce (for bugs)
Please make sure to execute the reproduction steps with newly recreated and empty tmp folders.
mmseqs createdb test.faa test.db
mmseqs cluster test.db test.clu tmp --min-seq-id 0.9
test.faa is attached in the gist
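To make the "newly recreated and empty tmp folders" requirement easy to satisfy in an automated pipeline, the two commands above could be wrapped so that every run starts from a fresh working directory. This is a rough sketch, not an official reproduction script: `build_commands` and `reproduce` are hypothetical names, the arguments are exactly those from the steps above, and it assumes `mmseqs` is available on `PATH`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_commands(fasta, workdir):
    """Return the two mmseqs invocations from the report as argument lists."""
    workdir = Path(workdir)
    db = workdir / "test.db"
    clu = workdir / "test.clu"
    tmp = workdir / "tmp"
    return [
        ["mmseqs", "createdb", str(fasta), str(db)],
        ["mmseqs", "cluster", str(db), str(clu), str(tmp), "--min-seq-id", "0.9"],
    ]


def reproduce(fasta):
    """Run both steps inside a fresh directory that is deleted afterwards."""
    if shutil.which("mmseqs") is None:
        raise RuntimeError("mmseqs not found on PATH")
    with tempfile.TemporaryDirectory() as workdir:
        # Start from an empty tmp folder, as the issue template asks.
        (Path(workdir) / "tmp").mkdir()
        for cmd in build_commands(fasta, workdir):
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    reproduce("test.faa")
```

Because the outputs live in a temporary directory, they are discarded when the run ends; that is fine for checking whether the error occurs, but a real pipeline would write the results somewhere persistent.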
MMseqs Output (for bugs)
Please make sure to also post the complete output of MMseqs. You can use gist.github.com for large output.
https://gist.github.com/matveykolesnik/43a90e7404e11881c29e2c80d79c5fec