Gday @psj1997 @luispedro and other Semibin developers,
Firstly, thanks for Semibin(2). It works amazingly well and recovers so many more bins than other binning methods :)
I want to share some feedback regarding database download and Semibin's documentation.
The HPC cluster I use at my institution blocks internet access on compute nodes. Therefore, the lazy download of the Semibin2 database failed when I ran Semibin (v1.5.1, Linux installation via bioconda).
It was difficult to work out that this was the cause, because the database isn't mentioned in the README, only in the FAQ section of the docs, and the error message wasn't informative (apologies, I have overwritten the log file or I would quote it).
I then tried following the FAQ in the docs to download the updated GTDB database, but the following command does not work in MMseqs2 v13.45111 (a known MMseqs2 issue, soedinglab/MMseqs2#561):
mmseqs databases GTDB GTDB tmp
Then, after looking at the Semibin codebase I was able to install the database manually:
wget 'https://zenodo.org/record/4751564/files/GTDB_v95.tar.gz?download=1'
mv 'GTDB_v95.tar.gz?download=1' GTDB_v95.tar.gz
tar -xzvf GTDB_v95.tar.gz
From there I specified -r {params.db}, and Semibin then worked perfectly.
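For anyone in the same situation, the manual steps above could be wrapped into two small shell functions. This is only a sketch: the function names `download_db` and `extract_db` are made up for illustration, the URL is the one quoted above, and it assumes `wget` and `tar` are available on a node that does have internet access (for example a login node).

```shell
# Sketch only: helper names are hypothetical, not part of Semibin.
set -eu

# Zenodo URL copied from the manual steps in this post.
DB_URL='https://zenodo.org/record/4751564/files/GTDB_v95.tar.gz?download=1'

# download_db DEST_DIR
# Fetch the archive into DEST_DIR. Using -O names the output file directly,
# so no rename of "GTDB_v95.tar.gz?download=1" is needed afterwards.
download_db() {
    mkdir -p "$1"
    wget -O "$1/GTDB_v95.tar.gz" "$DB_URL"
}

# extract_db ARCHIVE DEST_DIR
# Unpack a gzip-compressed tar archive into DEST_DIR (created if missing).
extract_db() {
    mkdir -p "$2"
    tar -xzf "$1" -C "$2"
}
```

On a node with internet access, something like `download_db "$HOME/semibin_db" && extract_db "$HOME/semibin_db/GTDB_v95.tar.gz" "$HOME/semibin_db"` (the path is a placeholder) would stage the files, after which the compute-node job can point -r at that location as described above.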
So perhaps either including a specific --download_database flag or script, or just documenting a manual install method would help future users like me without compute node internet access.
George