Adding conda/mamba install instructions
skchronicles committed Apr 10, 2024
1 parent d0b8584 commit a2be3cd
Showing 1 changed file with 65 additions and 26 deletions.
91 changes: 65 additions & 26 deletions docs/setup.md
@@ -1,15 +1,15 @@
## Dependencies

!!! note inline end "Requirements"
**Using Singularity**: `singularity>=3.5` `snakemake>=6.0`
**Using Singularity**: `singularity>=3.5` `snakemake<8.0`

**Using Conda or Mamba**: `conda/mamba` `snakemake>=6.0`
**Using Conda or Mamba**: `conda/mamba` `snakemake<8.0`

[Snakemake](https://snakemake.readthedocs.io/en/stable/getting_started/installation.html) must be installed on the target system. Snakemake is a workflow manager that orchestrates each step of the pipeline. The second dependency, i.e., [singularity](https://singularity.lbl.gov/all-releases) OR [conda/mamba](https://github.com/conda-forge/miniforge#mambaforge), handles the downloading/installation of any remaining software dependencies.

By default, the pipeline will utilize singularity; however, the `--use-conda` option of the [run](usage/run.md) sub command can be provided to use conda/mamba instead of singularity. If possible, we recommend using singularity over conda for reproducibility; however, it is worth noting that singularity and conda produce identical results for this pipeline.

If you are running the pipeline on Windows, please use the [Windows Subsystem for Linux (WSL)](https://learn.microsoft.com/en-us/windows/wsl/install). Singularity can be installed on WSL following these [instructions](https://www.blopig.com/blog/2021/09/using-singularity-on-windows-with-wsl2/).
If you are running the pipeline on Windows, please use the [Windows Subsystem for Linux (WSL)](https://learn.microsoft.com/en-us/windows/wsl/install). Singularity can be installed on WSL following these [instructions](https://www.blopig.com/blog/2021/09/using-singularity-on-windows-with-wsl2/). That said, on Windows we recommend using conda/mamba over singularity because it is easier to set up and install under WSL. Please see the instructions below to install conda/mamba on your system.

You can check to see if mpox-seek's software requirements are met by running:
```bash
@@ -20,6 +20,29 @@ which singularity \
|| which conda \
|| which mamba \
|| echo 'Error: singularity or conda or mamba are not installed.'

# Install conda/mamba if missing.
# Create directory to install
# everything in $HOME/pipelines.
mkdir -p "${HOME:-~}/pipelines/conda"
cd "${HOME:-~}/pipelines/conda"
# Download the installer for
# miniforge and install conda.
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
# Installs conda, a prompt will
# appear, hit ENTER, type yes a
# few times, and then enter no
# when it asks to initialize
# your shell/startup
bash "Miniforge3-$(uname)-$(uname -m).sh" -f -p "$PWD"

# Start up the conda environment,
# if conda/mamba are not installed
# follow the instructions above.
source "${HOME:-~}/pipelines/conda/etc/profile.d/conda.sh" || echo 'Error: conda/mamba is not installed!'
# Install snakemake if missing.
# Create environment with snakemake/7.32.3
mamba create -y -c conda-forge -c bioconda -n snakemake snakemake=7.32.3
```
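
The requirements above pin `snakemake<8.0`, but the check only confirms that the tools are on `$PATH`. As a minimal sketch (the `version_lt_8` helper is hypothetical and not part of mpox-seek), the installed Snakemake version could be compared against that bound like this:

```bash
# Hypothetical helper (not part of mpox-seek): succeeds when the
# major component of a version string such as "7.32.3" is below 8.
version_lt_8() {
  local major="${1%%.*}"
  [[ "$major" =~ ^[0-9]+$ ]] && (( major < 8 ))
}

# Compare the installed snakemake, if any, against snakemake<8.0.
if command -v snakemake >/dev/null 2>&1; then
  installed="$(snakemake --version)"
  if version_lt_8 "$installed"; then
    echo "OK: snakemake ${installed} satisfies snakemake<8.0"
  else
    echo "Error: snakemake ${installed} does not satisfy snakemake<8.0"
  fi
else
  echo 'Error: snakemake is not installed.'
fi
```

If the check fails, creating the pinned `snakemake=7.32.3` environment shown above should satisfy the requirement.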


@@ -30,11 +53,14 @@ Please ensure the software dependencies listed above are satisfied before gettin
You can install mpox-seek locally with the following command:
```bash
# Clone mpox-seek from Github
git clone https://github.com/OpenOmics/mpox-seek.git
mkdir -p "${HOME:-~}/pipelines/mpox-seek"
git clone https://github.com/OpenOmics/mpox-seek.git "${HOME:-~}/pipelines/mpox-seek"
# Change your working directory
cd "${HOME:-~}/pipelines/mpox-seek"
# Get usage information
./mpox-seek -h
# Create a conda environment for mpox-seek
mamba env create -y --name mpox-seek --file=workflow/envs/mpox.yaml
```

## Offline mode
@@ -43,12 +69,25 @@ The `mpox-seek` pipeline can be run in an offline mode where external requests a

#### Download resource bundle

???+ note

This pipeline does not have any reference files that need to be downloaded
prior to running. As such, everything on this page can be safely ignored! We
have bundled all the reference files for the pipeline within our Github
repository. All the reference files are located within the [resources folder](https://github.com/OpenOmics/mpox-seek/tree/main/resources).

Please feel free to skip over this section. There are no reference files that need to be downloaded prior to running the pipeline. All reference files are bundled with the pipeline.

To download the pipeline's resource bundle, please run the following command:
```bash
# Dry-run download of the resource bundle
# Dry-run download of the resource bundle,
# this will show you what will be downloaded.
./mpox-seek install --ref-path /data/$USER/refs \
--force --threads 4 --dry-run
# Download the resource bundle

# Download the resource bundle, currently
# all resources are bundled with the pipeline
# so there is nothing to download!
./mpox-seek install --ref-path /data/$USER/refs \
--force --threads 4
```
@@ -70,14 +109,13 @@ Please remember the path provided to the `--sif-cache` option above, you will ne

#### Cache conda environment

This next step is only applicable to conda/mamba users. If you are using singularity instead of conda/mamba, you can skip over this section. By default, when the `--use-conda` option is
provided, a conda environment will be built on the fly. Building a conda environment can be slow, and it also makes external requests so you will need internet access. With that being said, it may make sense to create/cache the conda environment once and re-use it. To cache/create mpox-seek's conda environment, please run the following command:
This next step is only applicable to conda/mamba users. If you are using singularity instead of conda/mamba, you can skip over this section. By default, when the `--use-conda` option is provided, a conda environment will be built on the fly. Building a conda environment can be slow, and it also makes external requests so you will need internet access. With that being said, it may make sense to create/cache the conda environment once and re-use it. To cache/create mpox-seek's conda environment, please run the following command:
```bash
# Create a conda/mamba env
# called mpox-seek, you only
# need to run this once
# on your computer/cluster
mamba env create -f workflow/envs/mpox-seek.yaml
mamba env create -f workflow/envs/mpox.yaml
```

Running the command above will create a named conda/mamba environment called `mpox-seek`. Now you can provide `--conda-env-name mpox-seek` to the [run sub command](usage/run.md). This will ensure conda/mamba is run in an offline-like mode where no external requests are made at runtime. It will use the local, named conda environment instead of building a new environment on the fly.
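
Since `--conda-env-name mpox-seek` relies on the named environment already existing, it may help to confirm that before an offline run. The following is only a sketch: the `env_listed` helper is hypothetical, and it assumes `conda env list` prints the environment name as the first field of each line.

```bash
# Hypothetical helper (not part of mpox-seek): reads `conda env list`
# output on stdin and succeeds if the named environment is listed.
env_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only run the check when conda is available on $PATH.
if command -v conda >/dev/null 2>&1; then
  if conda env list | env_listed mpox-seek; then
    echo "Found existing conda environment: mpox-seek"
  else
    echo "No mpox-seek environment found; create it before passing --conda-env-name"
  fi
fi
```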
@@ -100,19 +138,15 @@ Following the example below, please replace `--input .tests/*.gz` with your inpu
cd mpox-seek/
# Get usage information
./mpox-seek -h
# Download resource bundle
./mpox-seek install --ref-path $HOME/refs --force --threads 4
# Cache software containers
./mpox-seek cache --sif-cache $HOME/SIFs
# Dry run mpox-seek pipeline
./mpox-seek run --input .tests/*.gz --output tmp_01/ \
--resource-bundle $HOME/refs/mpox-seek \
--sif-cache $HOME/SIFs --mode local \
--dry-run
# Run mpox-seek pipeline
# in offline-mode
./mpox-seek run --input .tests/*.gz --output tmp_01/ \
--resource-bundle $HOME/refs/mpox-seek \
--sif-cache $HOME/SIFs --mode local
```

@@ -127,24 +161,29 @@ Following the example below, please replace `--input .tests/*.gz` with your inpu
cd mpox-seek/
# Get usage information
./mpox-seek -h
# Download resource bundle
./mpox-seek install --ref-path $HOME/refs --force --threads 4
# Add conda/mamba to $PATH
which conda \
|| source "${HOME:-~}/pipelines/conda/etc/profile.d/conda.sh" \
|| echo 'Error: conda/mamba is not installed.'
# Cache conda environment,
# creates a local conda env
# called mpox-seek
mamba env create -f workflow/envs/mpox-seek.yaml
# Dry run mpox-seek pipeline
mamba env create -f workflow/envs/mpox.yaml
# Activate snakemake conda environment
conda activate snakemake
# Dry run mpox-seek pipeline to
# see what steps/jobs will run
./mpox-seek run --input .tests/*.gz --output tmp_01/ \
--resource-bundle $HOME/refs/mpox-seek \
--mode local --conda-env-name mpox-seek \
--use-conda --dry-run
# Run mpox-seek pipeline
# with conda/mamba in
# offline-mode
--mode local --use-conda --conda-env-name mpox-seek \
--additional-strains resources/mpox_additional_strains.fa.gz \
--batch-id 2024-04-08 --bootstrap-trees --dry-run
# Run mpox-seek pipeline with conda/mamba in
# offline-mode, no external requests are made,
# all software dependencies are cached locally
./mpox-seek run --input .tests/*.gz --output tmp_01/ \
--resource-bundle $HOME/refs/mpox-seek \
--mode local --conda-env-name mpox-seek \
--use-conda
--mode local --use-conda --conda-env-name mpox-seek \
--additional-strains resources/mpox_additional_strains.fa.gz \
--batch-id 2024-04-08 --bootstrap-trees
```

=== "Biowulf"