

This pipeline takes as input a directory of genomes in complete or draft format (one file = one genome) and outputs a phylogeny of those genomes based on single-copy ribosomal proteins. Ribosomal proteins are identified with hmmsearch against the ribosomal protein alignments from Yutin et al., 2012, then aligned with mafft and used to build a phylogeny with raxml. The pipeline is based on the method detailed in Hehemann et al., 2016. It uses snakemake for ease of use and reproducibility.
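As a rough illustration of the three stages described above, the sketch below builds the command lines one might use for each tool. It is not the pipeline's actual code: the function names, file naming scheme, the `PROTGAMMAWAG` model, and the random seed are all assumptions chosen for the example, and it assumes protein sequences and a profile HMM built from the ribosomal protein alignments are already available.

```python
from pathlib import Path


def genome_names(genome_dir):
    """One file = one genome: use each file's stem as the genome name."""
    return sorted(p.stem for p in Path(genome_dir).iterdir() if p.is_file())


def stage_commands(proteins_faa, hmm_file, prefix):
    """Return illustrative argv lists for the search, align, and tree stages.

    All output file names here are hypothetical; the real pipeline defines
    its own names in its snakemake rules and config file.
    """
    hits = f"{prefix}.hits.tbl"    # tabular hmmsearch output
    unaligned = f"{prefix}.faa"    # ribosomal proteins pulled from the hits
    aligned = f"{prefix}.aln.faa"  # mafft alignment (mafft prints to stdout)
    return [
        # hmmsearch: profile HMM first, then the protein sequence file
        ["hmmsearch", "--tblout", hits, hmm_file, proteins_faa],
        # mafft: --auto lets mafft choose an alignment strategy
        ["mafft", "--auto", unaligned],
        # raxml: -s alignment, -n run name, -m model (assumed), -p seed (assumed)
        ["raxmlHPC", "-s", aligned, "-n", prefix, "-m", "PROTGAMMAWAG", "-p", "12345"],
    ]
```

The lists can be passed to `subprocess.run` one at a time; in the pipeline itself, snakemake handles this chaining and the dependencies between steps.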

## How to run

Dependencies of this pipeline are handled by conda. You can download Miniconda (for Python 3.5 or higher, please!) here.

After installing miniconda:

1. Edit the RiboTree_config.yml file to set the filepaths, parameters, and filenames for your run. All parameters are described in comments within the RiboTree_config.yml file.

1b. If running on a cluster that uses slurm as a job scheduler, also edit the RiboTree_cluster_params.yml file to match your cluster setup.

2a. If running on a single machine, simply run the shell script.

2b. If running on a cluster that uses slurm as a job scheduler, edit the RiboTree_run_on_slurm.sbatch jobscript as appropriate and submit it to the job scheduler.
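The steps above can be sketched as a small pre-flight helper that checks the files named in this README are present and then picks the launch command. This is only an illustration: the helper names are made up, and `run_script` stands in for the single-machine shell script, whose filename is not given here.

```python
from pathlib import Path

# Files named in this README; the cluster files only matter when using slurm.
CONFIG_FILE = "RiboTree_config.yml"
SLURM_FILES = ["RiboTree_cluster_params.yml", "RiboTree_run_on_slurm.sbatch"]


def missing_files(repo_dir, use_slurm):
    """Return the expected files that are absent from repo_dir."""
    needed = [CONFIG_FILE] + (SLURM_FILES if use_slurm else [])
    return [name for name in needed if not (Path(repo_dir) / name).is_file()]


def launch_command(use_slurm, run_script):
    """Return the argv list for launching the pipeline.

    run_script is a placeholder for the single-machine shell script.
    """
    if use_slurm:
        # Submit the jobscript to the slurm scheduler (step 2b).
        return ["sbatch", "RiboTree_run_on_slurm.sbatch"]
    # Run the shell script directly (step 2a).
    return ["bash", run_script]
```

Calling `missing_files(".", use_slurm=True)` before launching gives a quick list of anything that still needs to be put in place.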


Not tested on Windows or OSX machines. I suspect that at present the code will break on Windows, because Windows uses \ rather than / as the separator in filepaths. It may work on OSX.
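To illustrate the separator issue, the sketch below uses Python's standard `pathlib` module to show how the same path components render on each system, and how a path can be normalised to forward slashes. The directory and file names are hypothetical, and this is not a change that has been made to the pipeline.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same two components joined under each platform's conventions.
posix_path = PurePosixPath("genomes", "strain_A.fasta")
windows_path = PureWindowsPath("genomes", "strain_A.fasta")

# str() uses the platform's native separator.
posix_text = str(posix_path)      # forward slash
windows_text = str(windows_path)  # backslash

# as_posix() always renders forward slashes, which is what code that
# assembles paths with "/" expects to see.
normalised = windows_path.as_posix()
```

Building paths with `pathlib` (or `os.path.join`) instead of concatenating strings with `/` is the usual way to keep code portable across both systems.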


Pipeline to make a phylogeny from ribosomal proteins pulled from microbial genomes






