DupLoss-2 is a program for phylogenomic species tree inference using gene tree parsimony. It takes a collection of gene trees as input and seeks a species tree that best reconciles those gene trees under a gene duplication and loss reconciliation model. On phylogenomic datasets where gene duplication and loss are the primary drivers of gene family evolution, DupLoss-2 can significantly improve species tree reconstruction accuracy compared to existing methods. DupLoss-2 scales to whole-genome datasets with thousands of gene trees from hundreds of taxa. Further methodological details and experimental results appear in the paper cited below.
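To make the notion of reconciliation cost concrete, the sketch below counts the duplications implied by one gene tree against one candidate species tree, using the standard LCA mapping. It is an illustration only and is not DupLoss-2's search algorithm or cost function: it ignores losses, assumes binary gene trees written as nested tuples, and assumes each gene tree leaf is labeled directly with its species name. The function names `clusters`, `lca_map`, and `count_duplications` are hypothetical.

```python
def clusters(tree):
    """List the leaf set (cluster) of every node in a nested-tuple tree.

    The cluster of the tree's root is always the last element of the list.
    """
    if isinstance(tree, str):
        return [frozenset([tree])]
    found = []
    leaves = frozenset()
    for child in tree:
        sub = clusters(child)
        found.extend(sub)
        leaves |= sub[-1]  # last entry is the child's own cluster
    found.append(leaves)
    return found


def lca_map(species_set, species_clusters):
    """Return the smallest species tree cluster containing species_set.

    This cluster identifies the least common ancestor (LCA) in the species tree.
    Raises ValueError if a species is missing from the species tree.
    """
    return min((c for c in species_clusters if species_set <= c), key=len)


def count_duplications(gene_tree, species_clusters):
    """Return (species under this gene node, duplications in this subtree)."""
    if isinstance(gene_tree, str):
        return frozenset([gene_tree]), 0
    left, right = gene_tree  # assumes a binary gene tree
    left_species, left_dups = count_duplications(left, species_clusters)
    right_species, right_dups = count_duplications(right, species_clusters)
    here = lca_map(left_species | right_species, species_clusters)
    # A node is a duplication when it maps to the same species node as a child.
    is_dup = here in (
        lca_map(left_species, species_clusters),
        lca_map(right_species, species_clusters),
    )
    return left_species | right_species, left_dups + right_dups + int(is_dup)


if __name__ == "__main__":
    species_tree = (("A", "B"), "C")  # toy species tree, not real data
    sc = clusters(species_tree)
    print(count_duplications((("A", "B"), "C"), sc)[1])  # congruent tree: 0
    print(count_duplications((("A", "C"), "B"), sc)[1])  # conflicting tree: 1
```

Under gene tree parsimony, a candidate species tree would be scored by summing such costs over all input gene trees; how DupLoss-2 scores and searches candidate trees is described in the cited paper.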
This repository includes the complete source code, the user manual, test data, and precompiled executables for macOS, Linux, and Windows. In addition, a Python script that automates multiple runs of DupLoss-2 on the same dataset is available in the Executables directory as MultiRunScript.py.
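The contents of MultiRunScript.py are not reproduced here. As a rough sketch of the general idea of repeating a run several times, the helper below executes a caller-supplied command a fixed number of times and collects each run's standard output. The function `run_many` is hypothetical, and the demo command is a placeholder rather than a DupLoss-2 invocation; the actual executable name and arguments should be taken from the user manual.

```python
import subprocess
import sys


def run_many(command, runs):
    """Run `command` (a list of argv strings) `runs` times; return each stdout."""
    outputs = []
    for _ in range(runs):
        # check=True raises CalledProcessError if a run exits with an error.
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        outputs.append(result.stdout)
    return outputs


if __name__ == "__main__":
    # Placeholder command: substitute the real invocation from the user manual.
    demo_command = [sys.executable, "-c", "print('placeholder run')"]
    for index, output in enumerate(run_many(demo_command, 3), start=1):
        print(f"run {index}: {output.strip()}")
```

Collecting the output of each run separately makes it straightforward to compare the resulting species trees afterwards.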
Detailed instructions for compiling from source and for running the program are given in the user manual. A copy of the manual is available on GitHub: DupLoss-2 user manual
DupLoss-2 can be cited as follows:
DupLoss-2: Improved Phylogenomic Species Tree Inference under Gene Duplication and Loss
Rachel Parsons and Mukul S. Bansal
Systematic Biology; in press.