03. Offline Setup
This setup is required to run the MetScale pipeline on an air-gapped system. It assumes that you have completed the Install directions, including creating and activating your metscale environment.
NOTE: If you do not wish to run this workflow offline on an air-gapped system, instructions to run the workflows online can be found on the FAQs page.
When running the Offline Setup, you may specify which workflow files and dependencies to download, or you may choose to download all of the workflow files and dependencies at once (see the Workflow Setup Options table below).
NOTE: The offline setup command must be executed from the metscale/workflows/ directory.
[user@localhost ~]$ conda activate metscale
(metscale)[user@localhost ~]$ cd metscale/workflows
(metscale)[user@localhost workflows]$ python download_offline_files.py --workflow {workflow_setup_options}
Setup Option | Description |
---|---|
test_files | Downloads the Shakya subset 10 datasets |
read_filtering | Copies the adapter file to the data directory, downloads the biocontainers needed for the read filtering workflow, and creates singularity images |
assembly | Downloads the biocontainers needed for the assembly workflow and creates singularity images |
comparison | Downloads the biocontainer needed for the metagenome comparison workflow and creates the singularity image |
taxonomic_classification | Downloads all the biocontainers needed for tools within the taxonomic classification workflow and creates singularity images (note: this does not download the databases needed to run sourmash and kaiju; see the sourmash_db and kaiju_db setup options) |
sourmash_db | Downloads only the sourmash databases |
kaiju_db | Downloads only the kaiju database |
mtsv_db | Downloads only the default MTSv database |
functional_inference | Downloads the databases and biocontainers needed for the functional inference workflow and creates singularity images |
all | Downloads all files and biocontainers needed for all workflows and creates all singularity images |
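
To download only a subset of the setup options, one approach is to call the setup script once per option. The helper below is a minimal sketch, not part of MetScale: the function name run_offline_setup is hypothetical, and it assumes that the current directory is metscale/workflows/, that the metscale environment is active, and that each option is passed in its own invocation of the command shown above.

```shell
# Hypothetical helper: run the offline setup once for each option given.
# Assumes the current directory is metscale/workflows/ and that the
# metscale conda environment has already been activated.
run_offline_setup() {
    local option
    for option in "$@"; do
        echo "Setting up: ${option}"
        # One invocation per option; stop at the first failure so that a
        # broken download is not hidden by later output.
        python download_offline_files.py --workflow "${option}" || {
            echo "Setup failed for: ${option}" >&2
            return 1
        }
    done
}
```

For example, `run_offline_setup test_files read_filtering` would run the setup for the example datasets and then for the read filtering workflow, stopping early if either step fails.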
If you installed a previous version of the workflows, you will need to download new files to run v1.4. Existing tools also have version updates that must be installed, so be sure to follow those steps on the Install wiki page.
With the Singularity version updates since v1.2, the workflow biocontainers no longer run with *.simg images, but with *.sif images instead. To run workflows with the new Singularity version, the offline setup commands should be run to re-download all of the singularity containers as *.sif images. The older *.simg images from v1.2 can be deleted from the metscale/container_images/ directory with the following command:
rm -f *.simg
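
Because `rm -f *.simg` acts on whatever directory you happen to be in, a slightly more defensive variant may be useful. The sketch below is an illustration rather than part of MetScale: the function name remove_old_simg is hypothetical, the default path assumes you are one level above the metscale directory, and it relies on the `-maxdepth` and `-delete` options available in GNU and BSD find.

```shell
# Hypothetical helper: delete *.simg images from one directory only,
# printing each file as it is removed. *.sif images are left untouched.
remove_old_simg() {
    # Default path is an assumption; pass the real location explicitly.
    local image_dir="${1:-metscale/container_images}"
    if [ ! -d "${image_dir}" ]; then
        echo "Directory not found: ${image_dir}" >&2
        return 1
    fi
    # -maxdepth 1 keeps the deletion out of subdirectories.
    find "${image_dir}" -maxdepth 1 -type f -name '*.simg' -print -delete
}
```

Calling `remove_old_simg /path/to/metscale/container_images` names the target directory explicitly, so the command behaves the same regardless of the current working directory.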
Once you have successfully finished the setup, all of the files and images will be ready to proceed with executing the workflows offline.
To proceed to the Read Filtering workflow in an offline environment, you should have run the workflow setup with either 1) the test_files and read_filtering setup options or 2) the all setup option.
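
Before moving to the air-gapped system, it can help to confirm that the setup actually produced singularity images. The check below is a rough sketch under stated assumptions: the function name check_images_ready is hypothetical, and it assumes the new *.sif images are written to the same metscale/container_images/ directory that held the older *.simg images. It only counts images; it does not verify that every image a given workflow needs is present.

```shell
# Hypothetical check: report how many *.sif images exist in a directory
# and fail if there are none.
check_images_ready() {
    # Default path is an assumption; pass the real location explicitly.
    local image_dir="${1:-metscale/container_images}"
    local count
    # tr strips the padding that some wc implementations add.
    count="$(find "${image_dir}" -maxdepth 1 -type f -name '*.sif' 2>/dev/null | wc -l | tr -d ' ')"
    if [ "${count}" -eq 0 ]; then
        echo "No .sif images found in ${image_dir}; rerun the offline setup." >&2
        return 1
    fi
    echo "Found ${count} .sif image(s) in ${image_dir}"
}
```

A non-zero exit status from `check_images_ready` suggests that the offline setup should be rerun for the relevant setup option before continuing.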
IMPORTANT: If you did not download all of the workflow files and dependencies, keep in mind that the workflows are executed in a specific order (i.e., read filtering -> assembly -> comparison -> taxonomic classification -> functional inference -> post processing). We recommend running the example dataset through the workflows in that order to learn how everything operates; the subsequent pages of this wiki describe the workflows in the same order. Each page walks through one workflow step by step using the example dataset and default config files, and explains how to run your own samples through the workflow.