1. Usage
The Nextflow process must keep running until the pipeline completes, so it is advisable to run the pipeline in the background.
LAGOON-MCL uses Nextflow as its workflow manager and Singularity (Apptainer) as its container manager. On a computing cluster, Nextflow automatically handles job submission (SLURM, PBS, etc.).
Check that Nextflow and Apptainer are installed on your system.
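A quick way to perform this check is sketched below. The `check_tools` function is a hypothetical helper, not part of LAGOON-MCL; it only tests whether each command name can be found on your `PATH`.

```shell
# check_tools is a hypothetical helper, not shipped with LAGOON-MCL.
# It prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_tools nextflow apptainer || echo "Install the missing tools before continuing."
```

If both tools are reported as found, you can move on to retrieving the pipeline.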
Retrieve the files available on the GitHub repository.
git clone https://github.com/jroussea/lagoon-mcl.git
cd lagoon-mcl
The containers used by LAGOON-MCL are available at BioContainers.
- Diamond (v2.1.10)
- MCL (v22.282)
- MMseqs2 (v15.6f452)
- SeqKit2 (v2.9.0) (used only to build Pfam and AlphaFold databases)
A LAGOON-MCL-specific container is available at Docker Hub. It contains Python (v3.12.6) as well as the modules pandas (v2.2.3), numpy (v2.2.2), python-igraph (v0.11.8), biopython (v1.85), seaborn (v0.13.2) and jinja2 (v3.1.5). For more information on the container, and to view the Dockerfile, please consult the containers/lagoon-mcl/1.0.0 directory. Unlike the four previous containers, which are downloaded ready to use, this one must be built from the Docker image with the command: apptainer build --fakeroot [apptainer container] [docker image].
# SeqKit2 v2.9.0
wget -O containers/seqkit/2.9.0/seqkit.sif https://depot.galaxyproject.org/singularity/seqkit:2.9.0--h9ee0642_0
# Diamond v2.1.10
wget -O containers/diamond/2.1.10/diamond.sif https://depot.galaxyproject.org/singularity/diamond:2.1.10--h43eeafb_2
# MCL v22.282
wget -O containers/mcl/22.282/mcl.sif https://depot.galaxyproject.org/singularity/mcl:22.282--pl5321h031d066_2
# MMseqs2 v15.6f452
wget -O containers/mmseqs2/15.6f452/mmseqs.sif https://depot.galaxyproject.org/singularity/mmseqs2:15.6f452--pl5321h6a68c12_3
# LAGOON-MCL v1.0.0
apptainer build --fakeroot containers/lagoon-mcl/1.0.0/lagoon-mcl.sif docker://jroussea/lagoon-mcl:latest
Containers can be found in: containers/
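To confirm that the five images are in place before running the pipeline, a check along the following lines may help. This is only a sketch: `check_containers` is a hypothetical helper, and the file paths are simply those used in the download and build commands above.

```shell
# check_containers is a hypothetical helper, not shipped with LAGOON-MCL.
# Usage: check_containers [BASE_DIR]   (BASE_DIR defaults to "containers")
check_containers() {
    base_dir=${1:-containers}
    status=0
    for image in \
        seqkit/2.9.0/seqkit.sif \
        diamond/2.1.10/diamond.sif \
        mcl/22.282/mcl.sif \
        mmseqs2/15.6f452/mmseqs.sif \
        lagoon-mcl/1.0.0/lagoon-mcl.sif
    do
        # -s is true only if the file exists and is not empty,
        # which also catches downloads that failed part-way.
        if [ -s "$base_dir/$image" ]; then
            echo "ok: $base_dir/$image"
        else
            echo "missing or empty: $base_dir/$image" >&2
            status=1
        fi
    done
    return "$status"
}

check_containers containers || echo "Some container images still need to be downloaded or built."
```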
LAGOON-MCL uses the Pfam and AlphaFold Protein Structure Databases to obtain sequence information. Section 4. Databases describes the methods used to download these databases.
cd tool-kit/
# Download the AlphaFold Protein Structure Database [mandatory]
./build_alpahfold_db.sh
# Download Pfam [optional]
./build_pfam_db.sh
Databases are available in the database/ folder.
If you are using LAGOON-MCL on a computing cluster, you will need to provide Nextflow with a configuration file specific to your system. Information on executors (SLURM, PBS, AWS, ...) can be found in the executor section of the Nextflow documentation. For some institutes, this file is already referenced in nf-core/configs. If this is the case, you can download the file and use it with -c path/to/your/institute/config/file/institute_file.config when executing the pipeline.
nextflow run main.nf -profile singularity -params-file params_test.yaml
The files params_test.yaml and params.yaml contain the various parameters used for running LAGOON-MCL.
For more information on running the pipeline, please see the 3. Tutorial.
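Because the Nextflow process has to stay alive until the pipeline finishes, the run command can be wrapped so that it starts in the background and, on a cluster, picks up your configuration file. The sketch below is one possible way to do this, not the pipeline's own launcher: `launch_lagoon`, the `DRY_RUN` variable and the log file name `lagoon-mcl.log` are all assumptions made for illustration. It relies on the standard `nohup` utility to keep Nextflow running after you log out.

```shell
# launch_lagoon is a hypothetical wrapper, not shipped with LAGOON-MCL.
# Usage: launch_lagoon PARAMS_FILE [CLUSTER_CONFIG]
# Set DRY_RUN=1 to print the command instead of running it.
launch_lagoon() {
    params_file=$1
    cluster_config=${2:-}

    # Base command, as given in this section.
    set -- nextflow run main.nf -profile singularity -params-file "$params_file"

    # Add the institute or cluster configuration only when one is provided.
    if [ -n "$cluster_config" ]; then
        set -- "$@" -c "$cluster_config"
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        # nohup keeps the process alive after logout; output goes to a log file
        # (lagoon-mcl.log is an assumed name).
        nohup "$@" > lagoon-mcl.log 2>&1 &
        echo "Nextflow started in the background (PID $!)"
    fi
}

# Preview the command without launching anything.
DRY_RUN=1 launch_lagoon params_test.yaml path/to/your/institute/config/file/institute_file.config
```

Run from the lagoon-mcl directory, calling `launch_lagoon params.yaml` without `DRY_RUN` would start the pipeline locally in the background; progress can then be followed in the log file.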
Retrieve the latest version of the pipeline from the GitHub repository.
cd lagoon-mcl/
git pull
All parameters are available here