03. Offline Setup

Krista Ternus edited this page Nov 11, 2020 · 18 revisions

Offline Setup

Overview

This setup is required to run the MetScale pipeline on an air-gapped system. It assumes that you have completed the Install directions, including creating and activating your metscale environment.

NOTE: If you do not wish to run this workflow offline on an air-gapped system, instructions to run the workflows online can be found on the FAQs page.

Offline Setup Process

When running the Offline Setup, you may specify which workflow files and dependencies to download, or you may download all of the workflow files and dependencies at once (see the Offline Setup Options table below).

NOTE: The offline setup command must be executed from the metscale/workflows/ directory.

[user@localhost ~]$ conda activate metscale 

(metscale)[user@localhost ~]$ cd metscale/workflows
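Because the setup command depends on both the active environment and the working directory, a small pre-flight check can catch mistakes before any download starts. The following is a minimal sketch, not part of MetScale: the function name preflight_check is hypothetical, and it assumes the environment is named metscale as in the commands above. It relies on conda exporting the active environment name in CONDA_DEFAULT_ENV.

```shell
# Hypothetical pre-flight check; the function name is illustrative and
# not part of MetScale.
preflight_check() {
    dir="${1:-.}"
    # conda exports CONDA_DEFAULT_ENV with the name of the active environment.
    if [ "${CONDA_DEFAULT_ENV:-}" != "metscale" ]; then
        echo "error: run 'conda activate metscale' first" >&2
        return 1
    fi
    # The setup script is expected in the directory the command runs from.
    if [ ! -f "${dir}/download_offline_files.py" ]; then
        echo "error: download_offline_files.py not found in ${dir}" >&2
        return 1
    fi
    echo "ok: ready to run the offline setup from ${dir}"
}

# Example usage, from inside metscale/workflows:
#   preflight_check . && echo "safe to run the offline setup command"
```

The check only reads the environment and the file system, so it is safe to run repeatedly.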

Offline Setup Command

python download_offline_files.py --workflow {workflow_setup_options}

Offline Setup Options

| Setup Option | Description |
| --- | --- |
| test_files | Downloads the Shakya subset 10 datasets |
| read_filtering | Copies the adapter file to the data directory, downloads the biocontainers needed for the read filtering workflow, and creates singularity images |
| assembly | Downloads the biocontainers needed for the assembly workflow and creates singularity images |
| comparison | Downloads the biocontainer needed for the metagenome comparison workflow and creates a singularity image |
| taxonomic_classification | Downloads all the biocontainers needed for tools within the taxonomic classification workflow and creates singularity images (note: this does not download the databases needed to run sourmash and kaiju; see the sourmash_db and kaiju_db setup options) |
| sourmash_db | Downloads only the sourmash databases |
| kaiju_db | Downloads only the kaiju database |
| mtsv_db | Downloads only the default MTSv database |
| functional_inference | Downloads the databases and biocontainers needed for the functional inference workflow and creates singularity images |
| all | Downloads all files and biocontainers needed for all workflows and creates all singularity images |
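To download several setup options without downloading everything, one conservative approach is to run the documented command once per option. The sketch below assumes nothing about whether the script accepts several values after --workflow; the helper name run_offline_setup and the DRY_RUN variable are hypothetical and exist only for illustration. With DRY_RUN=1 the commands are printed instead of executed.

```shell
# Hypothetical helper: run the documented setup command once per option.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_offline_setup() {
    for option in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python download_offline_files.py --workflow ${option}"
        else
            # Stop at the first failed download so the problem is visible.
            python download_offline_files.py --workflow "${option}" || return 1
        fi
    done
}

# Example: preview the commands for the test data and read filtering setup.
DRY_RUN=1 run_offline_setup test_files read_filtering
```

Dropping DRY_RUN=1 executes the same commands, which must be run from the metscale/workflows/ directory.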

If you have installed a previous version of the workflows, you must download new files to run v1.4. Some existing tools also have version updates that need to be installed, so be sure to follow those steps on the Install wiki page.

With the Singularity version updates since v1.2, the workflow biocontainers no longer run with *.simg images; they run with *.sif images instead. To run workflows with the new Singularity version, rerun the offline setup commands to re-download all of the singularity containers as *.sif images. The older *.simg images from v1.2 can be deleted by running the following command from within the metscale/container_images/ directory:

rm -f *.simg
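Since rm -f deletes without confirmation, it can help to list the old images before removing them. This is a minimal sketch with hypothetical helper names (list_old_images, remove_old_images); it takes the image directory as an argument so it does not depend on the current working directory, and it only touches files directly inside that directory. It assumes a find that supports -maxdepth, as GNU and BSD find do.

```shell
# Hypothetical helpers; the names are illustrative and not part of MetScale.
list_old_images() {
    # Print legacy *.simg images directly inside the given directory.
    find "${1:-.}" -maxdepth 1 -type f -name '*.simg'
}

remove_old_images() {
    # Delete only the files that list_old_images would print.
    find "${1:-.}" -maxdepth 1 -type f -name '*.simg' -exec rm -f {} +
}

# Example usage, from the directory that contains metscale/:
#   list_old_images metscale/container_images     # review the list first
#   remove_old_images metscale/container_images   # then delete
```

The newer *.sif images are left untouched because the name pattern matches only *.simg.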

Once the setup has finished successfully, all of the files and images needed to execute the workflows offline will be in place.

Quick Check Before Proceeding

To proceed to the Read Filtering workflow in an offline environment, you should have run the offline setup with either 1) both the test_files and read_filtering setup options or 2) the all setup option.
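One quick sanity check before moving on is to confirm that at least some *.sif images exist. The sketch below is an assumption-laden illustration: the helper name count_sif_images is hypothetical, and it assumes the setup writes its images to metscale/container_images/, the directory named above for the older *.simg images. It does not verify which images are present or whether the test files were downloaded.

```shell
# Hypothetical helper; assumes images are stored in the directory passed in.
count_sif_images() {
    # Count *.sif files directly inside the directory; strip wc's padding.
    find "${1:-.}" -maxdepth 1 -type f -name '*.sif' | wc -l | tr -d ' '
}

# Example usage, from inside metscale/workflows:
#   count_sif_images ../container_images
```

A count of 0 suggests that the setup has not yet been run for any option that creates singularity images.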

IMPORTANT: If you did not download all of the workflow files and dependencies, please keep in mind that the workflows are executed in a specific order (i.e., read filtering -> assembly -> comparison -> taxonomic classification -> functional inference -> post processing). It is recommended that users run the example dataset through the workflows in that order to learn how everything operates; the workflows are described in that order throughout the subsequent pages of this wiki. Those pages walk through each workflow step by step using the example dataset and default config files, and they describe how to run a user's own samples through the workflow.