A comprehensive bacterial core gene-set annotation pipeline based on Roary and pairwise ILPs

MarieLataretu/ribap

 
 


RIBAP

Roary ILP Bacterial Annotation Pipeline

This tool is currently under heavy development, so expect some bugs, but feel free to report issues.

Annotate your protein sequences with Prokka and determine a pan-genome with Roary. This pan-genome is then refined using ILPs (integer linear programs) that solve the best matching for each pairwise strain comparison made with MMseqs2.

What is this about?

A common task when you have a bunch of bacterial genomes in your hands is the calculation of a core gene set: we want to know which genes are homologous and shared between certain bacteria. However, defining homology based only on sequence similarity often underestimates the true core gene set, in particular when diverse species are compared. RIBAP combines sequence homology information from Roary with pairwise ILP calculations to produce a more complete core gene set, even at the genus level. RIBAP first performs annotations with Prokka, then calculates the core gene set using Roary and pairwise ILPs, and finally visualizes the results in an interactive HTML table accompanied by protein multiple sequence alignments and trees. RIBAP comes with Nextflow and Docker/Conda support for easy execution.

How can I give it a try?

Easy: you just need a working nextflow installation plus either docker or conda (see below). You have nextflow and docker? Give it a try:

nextflow run hoelzer-lab/ribap --fasta "$HOME/.nextflow/assets/hoelzer-lab/ribap/data/*.fasta"

You have nextflow and conda? Okay:

nextflow run hoelzer-lab/ribap --fasta "$HOME/.nextflow/assets/hoelzer-lab/ribap/data/*.fasta" -profile conda

Need some of these dependencies? See below.

Installation

  • runs with the workflow manager nextflow using docker or conda
  • this means all programs are automatically pulled via docker or conda
  • only docker or conda and nextflow need to be installed (docker is used by default)
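
A quick pre-flight check can confirm that these tools are on your PATH before you launch the pipeline. The following is only a minimal sketch: the helper names `have` and `pick_backend` are made up for illustration and are not part of RIBAP. It only tests whether the commands exist, not whether they work.

```bash
# Succeed if the given command is found on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print which execution backend is available: "docker", "conda", or "none".
# Docker is checked first because the pipeline uses it by default.
pick_backend() {
  if have docker; then
    echo docker
  elif have conda; then
    echo conda
  else
    echo none
  fi
}

if have nextflow; then
  echo "nextflow found; backend: $(pick_backend)"
else
  echo "nextflow not found on PATH"
fi
```

If the backend is reported as conda, remember to add -profile conda to your nextflow run command.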

Nextflow

Needed in both cases (conda, docker)

sudo apt-get update
sudo apt install -y default-jre
curl -s https://get.nextflow.io | bash 
sudo mv nextflow /bin/

Using Conda

Just copy the commands and follow the installation instructions. Let the installer configure conda for you. You need to specify -profile conda to run the pipeline with conda support.

cd
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

See here if you need an installer for a platform other than Linux, which is used above.

Using Docker

Easy

If you don't have experience with bioinformatics tools, just copy the commands into your terminal to set everything up:

sudo apt-get install -y docker-ce docker-ce-cli containerd.io
sudo usermod -a -G docker $USER
  • restart your computer
  • try out the installation by entering the following

Experienced

Dependencies

  • docker (add your user to the docker group, so no sudo is needed)
  • nextflow + java runtime
  • git (should be already installed)
  • wget (should be already installed)
  • tar (should be already installed)
  • Docker installation here
  • Nextflow installation here
  • move or add the nextflow executable to a bin path
  • add your user to the docker group via sudo usermod -a -G docker $USER
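
After running usermod, the group change only takes effect in new login sessions. One way to check whether your current session has picked it up is sketched below. The helper name `session_in_group` is hypothetical, and the sketch assumes the group is called docker, as in the usermod command above.

```bash
# Succeed if the current session is a member of the group named in $1.
# "id -nG" without a user argument lists the groups of the running session,
# so a freshly added group shows up only after logging in again.
session_in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if session_in_group docker; then
  echo "This session is in the docker group; docker should work without sudo."
else
  echo "This session is not in the docker group; log out and back in (or restart) after usermod."
fi
```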

Execution examples

Get or update the workflow:

nextflow pull hoelzer-lab/ribap

Get help:

nextflow run hoelzer-lab/ribap --help

Run with RAxML tree calculation and specified output dir:

nextflow run hoelzer-lab/ribap --fasta '*.fasta' --tree --outdir ~/ribap
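
Note that the '*.fasta' pattern is quoted, so the shell passes it on unexpanded and Nextflow resolves the files itself. Before starting a run, it can help to check that the pattern actually matches something. This is a minimal sketch with a hypothetical helper `count_fasta`; it only counts files and does not validate their content.

```bash
# Print the number of files matching *.fasta in directory $1
# (default: current directory). The search is not recursive.
count_fasta() {
  dir="${1:-.}"
  n=0
  for f in "$dir"/*.fasta; do
    # When nothing matches, the pattern stays literal and -e is false.
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

n="$(count_fasta .)"
if [ "$n" -gt 0 ]; then
  echo "Found $n FASTA file(s) in the current directory."
else
  echo "No *.fasta files found in the current directory." >&2
fi
```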

Flowchart

chart
