Installation

KijinKim edited this page May 27, 2023 · 64 revisions

Note: VirPipe is intended to run on Linux and macOS. It is very unlikely to run on Windows.

VirPipe requires Conda, Git, Docker, and Git LFS (only for zoonotic rank).
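Before starting, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of VirPipe: `missing_tools` is a hypothetical helper name, and it assumes that Git LFS is exposed as the `git-lfs` executable and that `conda` is reachable from the shell you run it in.

```shell
# Hypothetical helper (not shipped with VirPipe).
# Prints the name of every listed tool that is not found, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (git-lfs is only needed for zoonotic rank):
# missing_tools conda git docker git-lfs
```

An empty result means every listed tool was found.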

Quick Start

git clone --recurse-submodules https://github.com/KijinKims/virpipe
cd virpipe/build_image
docker build -t virpipe --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) . # can take about 1 hour; consider downloading the databases while the image builds
cd ..
sh download_database.sh # can take several hours
conda create -n virpipe -c skkujin virpipe
conda activate virpipe
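Since the image build and the database download are both slow, one option is to run them at the same time. The sketch below is only an illustration: `run_both` is a hypothetical helper, and it assumes the two steps do not interfere with each other when run concurrently.

```shell
# Hypothetical helper (not shipped with VirPipe).
# Runs two shell command strings concurrently; succeeds only if both succeed.
run_both() {
    sh -c "$1" &                 # first command runs in the background
    bg_pid=$!
    fg_status=0
    sh -c "$2" || fg_status=$?   # second command runs in the foreground
    bg_status=0
    wait "$bg_pid" || bg_status=$?
    [ "$bg_status" -eq 0 ] && [ "$fg_status" -eq 0 ]
}

# Example, from the root of the cloned repository:
# run_both 'cd build_image && docker build -t virpipe --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) .' \
#          'sh download_database.sh'
```

Each command string runs in its own `sh -c` child process, so the `cd build_image` in the first one does not change the directory seen by the second.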

Overview

Installing VirPipe consists of the following steps:

Clone the git repository

Build Docker image

Download databases

Install python wrapper (this creates the Conda environment)

Each step takes only a few commands. Following the instructions on this page is enough to complete the installation.

Clone the git repository

To install VirPipe, first clone its Git repository with the commands below.

git clone --recurse-submodules https://github.com/KijinKims/virpipe
cd virpipe

If Git LFS is installed on your computer, the large files used by zoonotic rank (about 40 MB) are cloned as well. If you do not use zoonotic rank, you can omit the --recurse-submodules option.

git clone https://github.com/KijinKims/virpipe
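The choice between the two clone commands can be written down as a small sketch. Both function names here are hypothetical and not part of VirPipe; `lfs_available` relies on `git lfs version` failing when Git LFS is not installed.

```shell
# Hypothetical helpers (not shipped with VirPipe).

# Succeeds if Git LFS is installed and reachable through git.
lfs_available() {
    git lfs version >/dev/null 2>&1
}

# Prints the clone command; pass "yes" if you need zoonotic rank.
clone_command() {
    repo_url="https://github.com/KijinKims/virpipe"
    if [ "$1" = "yes" ]; then
        printf 'git clone --recurse-submodules %s\n' "$repo_url"
    else
        printf 'git clone %s\n' "$repo_url"
    fi
}

# Example:
# lfs_available || echo "Git LFS not found; install it before cloning for zoonotic rank" >&2
# clone_command yes
```

The functions only print or test; they do not clone anything themselves, so you can review the command before running it.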

If you have already cloned the repository and want to update it:

git fetch
git pull origin master

Build Docker image

Note: Docker must be installed, and your user must have permission to access it. You can test the installation with the official hello-world image.

docker run hello-world

Try the command below if Docker is installed but you cannot access it. It only helps if your user already belongs to the docker group.

newgrp docker
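To see whether group membership is the problem, you can inspect the group lists reported by `id`. This is a sketch with a hypothetical helper name, `has_group`; it assumes your system uses the conventional `docker` group for socket access.

```shell
# Hypothetical helper (not shipped with VirPipe).
# Succeeds if group name $2 appears in the space-separated group list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Examples:
# has_group "$(id -nG)" docker          && echo "current shell has the docker group"
# has_group "$(id -nG "$USER")" docker  && echo "account is a member of the docker group"
```

If the second check succeeds but the first does not, the account is in the group but the current shell has not picked it up yet, which is the situation `newgrp docker` (or logging out and back in) addresses.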

VirPipe consists of a Docker container and a wrapper program that forwards parameters to the Nextflow scripts inside the container. You first build a Docker image; containers instantiated from this image serve as the workspaces where every task runs. Build the image with the commands below, run from the root of the cloned repository. This step can take up to 1 hour.

cd build_image
docker build -t virpipe --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) .
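After a long build it can be reassuring to confirm that the image is really there. A minimal sketch, assuming the image was tagged `virpipe` as in the command above; `image_exists` is a hypothetical helper name.

```shell
# Hypothetical helper (not shipped with VirPipe).
# Succeeds if a local Docker image with the given name exists.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Example:
# image_exists virpipe && echo "virpipe image is ready"
```

The check also fails when Docker itself is missing or inaccessible, so a failure does not necessarily mean the build went wrong.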

Download databases

You can download ready-made databases by running the download script in the root directory of the cloned repository.

sh download_database.sh

This script downloads the ref_viruses_rep_genomes database for BLAST, the k2_viral database for Kraken2, a custom database built from RefSeq viral sequences for Centrifuge, and a database for taxonomizr. The last two are fetched from a Zenodo repository.

However, in our experience, manual download through a browser is faster and more stable, so you can download the databases manually via the links below if the script does not work well.

The current list of databases is shown below. We are looking for a way to reduce the size of the taxonomizr database by limiting the taxa to viruses.

| Name | Compressed size (GB) | Decompressed size (GB) | Included taxa |
|---|---|---|---|
| Kraken2 | 0.4 | 0.5 | Refseq_viral |
| BLAST | 0.1 | 0.1 | ref_viruses_rep_genomes |
| Centrifuge | 0.1 | 0.2 | Refseq_viral |
| Taxonomizr | 6.5 | 36 | All |
| Total | 7.3 | 36.9 | |
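Given these sizes, it may be worth checking free disk space before downloading. The sketch below uses hypothetical helper names and one assumption that the source does not state: that the compressed archives and the decompressed databases exist on disk at the same time, so roughly 7.3 + 36.9 GB, rounded up to 45 GB, is needed.

```shell
# Hypothetical helpers (not shipped with VirPipe).

# Succeeds if $1 (available space in KiB) covers $2 (required whole GiB).
has_space() {
    need_kb=$(( $2 * 1024 * 1024 ))
    [ "$1" -ge "$need_kb" ]
}

# Prints the available space, in KiB, on the filesystem of the current directory.
available_kb() {
    df -Pk . | awk 'NR == 2 { print $4 }'
}

# Example, run from the directory where the databases will be stored
# (45 is an assumed requirement, see above):
# has_space "$(available_kb)" 45 || echo "less than 45 GB free here" >&2
```

The helper treats 1 GB as 1024 × 1024 KiB, which errs slightly on the side of asking for more space.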

Install python wrapper

Although the built Docker image can be used directly, a Python wrapper is provided to make it easier to use. The wrapper can be installed with the Conda package manager.

Please set up Bioconda if you have not done so before.

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict

You can then install the virpipe Conda package.

conda create -n virpipe -c skkujin virpipe
conda activate virpipe

Once you have completed every step, you can test your installation.