Installation
NOTE! VirPipe is intended to run on Linux and macOS. It will most likely not run on Windows.
VirPipe requires Conda, Git, Docker, and Git LFS (only needed for zoonotic rank).
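Before starting, it can help to confirm that the required tools are on your PATH. The snippet below is a minimal sketch, not part of VirPipe: `check_tool` is a hypothetical helper name, and it assumes Git LFS is installed as the usual `git-lfs` executable and that `conda` is reachable from a plain `sh` session.

```shell
#!/bin/sh
# Pre-flight check: report which of the required tools can be found on PATH.
# check_tool is a hypothetical helper, not something VirPipe provides.
check_tool() {
    # command -v is the POSIX way to test whether a command can be found.
    command -v "$1" >/dev/null 2>&1
}

missing=""
for tool in conda git docker git-lfs; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing="$missing $tool"
    fi
done

if [ -n "$missing" ]; then
    echo "Not found:$missing (git-lfs is only needed for zoonotic rank)"
fi
```

The script only reports what it finds and never exits with an error, so it is safe to run at any point.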
git clone --recurse-submodules https://github.com/KijinKims/virpipe
cd virpipe/build_image
docker build -t virpipe --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) . # can take ~1 hr; consider downloading the databases while it builds
cd ..
sh download_database.sh # can take several hours
conda create -n virpipe -c skkujin virpipe
conda activate virpipe
Installation consists of five steps: cloning the repository, checking Docker access, building the Docker image, downloading the databases, and installing the conda wrapper.
Each step takes only a few commands, so following the instructions here is enough.
To install VirPipe, first clone its Git repository with the commands below.
git clone --recurse-submodules https://github.com/KijinKims/virpipe
cd virpipe
If Git LFS is installed on your computer, the large files used by zoonotic rank (about 40 MB) are cloned too. If you don't use zoonotic rank, you can omit the --recurse-submodules option.
git clone https://github.com/KijinKims/virpipe
If you have already cloned the repository and want to update it:
git fetch
git pull origin master
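By default, `git pull` does not check out new submodule commits, so a clone made with --recurse-submodules may also need its submodules refreshed. The sketch below wraps both steps; `update_checkout` is a hypothetical helper name used for illustration only.

```shell
#!/bin/sh
# update_checkout DIR: pull master, then bring submodules to the recorded commits.
# Hypothetical helper, not part of VirPipe.
update_checkout() {
    dir=${1:-.}
    # Refuse to run outside a Git working tree (also fails if git is missing).
    git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1 || {
        echo "not a Git working tree: $dir" >&2
        return 1
    }
    git -C "$dir" pull origin master &&
        git -C "$dir" submodule update --init --recursive
}

# Usage, from inside the cloned repository:
#   update_checkout .
```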
Note! Docker must be installed, and your user must have permission to access it. You can test the installation with the official hello-world image.
docker run hello-world
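Try the command below if Docker is installed but you can't access it. It starts a new shell with the docker group active, so it only helps if your user already belongs to that group.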
newgrp docker
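To see whether the docker group is active in your current shell, you can inspect the output of `id`. This is a sketch that assumes a typical Linux setup where access to the Docker socket is granted through a group named docker; `in_group` is a hypothetical helper name.

```shell
#!/bin/sh
# in_group NAME: succeed if the current shell's active groups include NAME.
# Hypothetical helper for illustration.
in_group() {
    # id -nG prints the group names of the current process, space-separated.
    for g in $(id -nG); do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

if in_group docker; then
    echo "docker group is active in this shell"
else
    echo "docker group is not active in this shell"
    echo "if your account was just added to it, run 'newgrp docker' or log in again"
fi
```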
VirPipe consists of a Docker container and a wrapper program that forwards parameters to the Nextflow scripts inside the container. You first build a Docker image; containers instantiated from this image serve as the workspaces in which every task runs. Build the image with the commands below. This step can take up to 1 hour.
cd build_image
docker build -t virpipe --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) .
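The two --build-arg options pass your numeric user and group IDs into the build. Presumably this lets the user inside the container match your host user, so that files written to shared folders stay owned by you; that reading is an assumption about the Dockerfile, not something stated here. The sketch below only prints the expanded command so you can see what will be passed.

```shell
#!/bin/sh
# Dry run: print the fully expanded build command without invoking Docker.
uid=$(id -u)   # numeric user ID of the invoking user
gid=$(id -g)   # numeric primary group ID of the invoking user
build_cmd="docker build -t virpipe --build-arg USER_ID=$uid --build-arg GROUP_ID=$gid ."
echo "$build_cmd"

# After a real build, this would confirm that the image exists (requires Docker):
#   docker image inspect virpipe >/dev/null && echo "image virpipe is present"
```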
You can download ready-made databases by running the shell script in the root folder of the cloned repository (run cd .. first if you are still in build_image).
sh download_database.sh
This script downloads the ref_viruses_rep_genomes database for BLAST, the k2_viral database for Kraken2, a custom database built from RefSeq viral sequences for Centrifuge, and a database for taxonomizr. The last two are fetched from a Zenodo repository.
In our experience, however, downloading manually through a browser is faster and more stable, so you can fetch the databases via the links below if the script doesn't work well.
The current list of databases is shown below. We are looking for a way to reduce the size of the taxonomizr database by limiting the taxa to viruses.
Name | Compressed size (GB) | Decompressed size (GB) | Included taxa |
---|---|---|---|
Kraken2 | 0.4 | 0.5 | Refseq_viral |
BLAST | 0.1 | 0.1 | ref_viruses_rep_genomes |
Centrifuge | 0.1 | 0.2 | Refseq_viral |
Taxonomizr | 6.5 | 36 | All |
Total | 7.3 | 36.9 | |
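Given these sizes, it is worth checking free disk space before downloading. The sketch below assumes that the compressed archives and the extracted files coexist for a while, which gives 7.3 + 36.9 = 44.2 GB, rounded up to 45 GB. `avail_gb` is a hypothetical helper name, and the figure is only approximate.

```shell
#!/bin/sh
# avail_gb DIR: print the free space on DIR's filesystem in whole GB.
# Hypothetical helper for illustration.
avail_gb() {
    # df -P prints one POSIX-format line per filesystem; -k reports 1024-byte blocks.
    # Field 4 of the second line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Assumption: archives (7.3 GB) and extracted data (36.9 GB) exist side by side.
needed_gb=45
free_gb=$(avail_gb .)

if [ "$free_gb" -ge "$needed_gb" ]; then
    echo "OK: ${free_gb} GB free, about ${needed_gb} GB needed"
else
    echo "Warning: only ${free_gb} GB free, about ${needed_gb} GB needed"
fi
```

Run it from the folder where the databases will be stored, since `df` reports on the filesystem that contains the given path.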
Although the built Docker image can be used directly, a Python wrapper is provided to make it easier to use. The wrapper can be installed with the conda package manager.
Please set up the Bioconda channels if you have not done so before.
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
You can then install the virpipe conda package.
conda create -n virpipe -c skkujin virpipe
conda activate virpipe
Once you have completed every step, you can test your installation.