A FAQ-like Introduction to RogueNaRok
RogueNaRok is an algorithm for the identification of rogue taxa in a tree set.
Download the code and run the "make" command.
For a parallel version of the RogueNaRok algorithm, use "make mode=parallel". Note that the parallel version requires the pthreads library. Running any of the programs without arguments prints the help message.
Rogue taxa are wandering taxa that assume many different phylogenetic positions in a set of bootstrap (or Bayesian-sampled) trees. Ambivalent or insufficient phylogenetic signal is usually the reason for this phenomenon. As a result, rogue taxa decrease resolution and/or support in the consensus tree. Removing (i.e., pruning) them from a tree set may produce a more informative consensus tree.
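The effect described above can be illustrated with a toy example. The following is a minimal sketch, not RogueNaRok's implementation: trees are written as nested tuples instead of Newick strings, the helper names (`leaves`, `prune`, `bipartitions`, `consensus_support`) and the taxa `A` to `E` and `R` are hypothetical, and "support" is simply the fraction of trees containing a bipartition. Taxon `R` attaches to a different place in each tree, while the remaining taxa always form the same backbone.

```python
from collections import Counter

def leaves(tree):
    """Return the set of leaf names below a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def prune(tree, taxon):
    """Remove one taxon and collapse any node left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [p for p in (prune(child, taxon) for child in tree) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)

def bipartitions(tree, all_taxa):
    """Collect non-trivial bipartitions, stored as the side without a reference taxon."""
    ref = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        if ref in side:
            side = all_taxa - side
        # Keep only splits with at least two taxa on each side.
        if 2 <= len(side) <= len(all_taxa) - 2:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found

def consensus_support(trees, threshold=0.5):
    """Map each bipartition occurring in more than `threshold` of the trees to its frequency."""
    all_taxa = leaves(trees[0])
    counts = Counter(b for t in trees for b in bipartitions(t, all_taxa))
    return {b: c / len(trees) for b, c in counts.items() if c / len(trees) > threshold}

# Four unrooted trees (trifurcating top level); the rogue "R" moves around.
trees = [
    ((("A", "R"), "B"), ("C", "D"), "E"),
    (("A", "B"), (("C", "R"), "D"), "E"),
    (("A", "B"), ("C", "D"), ("E", "R")),
    (("A", "B"), ("C", ("D", "R")), "E"),
]

before = consensus_support(trees)
after = consensus_support([prune(t, "R") for t in trees])
print(len(before), sum(before.values()))  # 1 0.75
print(len(after), sum(after.values()))    # 2 2.0
```

With `R` included, only one bipartition reaches a majority (75% support). After pruning `R`, all four trees agree, so the majority consensus is fully resolved with 100% support on both internal branches.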
The input needs to be a set of fully bifurcating unrooted trees in Newick format contained in one single tree file.
This is important to emphasize: RogueNaRok identifies rogue taxa that fit the definition above. Every taxon that has a detrimental effect on the support in a consensus tree is considered a rogue taxon and can therefore be detected by RogueNaRok based on the bootstrap trees. Other classes of problematic taxa, such as those that cause long branch attraction or problems with convergent evolution, will not be detected by RogueNaRok if they do not have a negative effect on the support.
- Optimize either support or resolution of the resulting pruned consensus tree.
- Optimize with respect to a minimum frequency threshold for the bipartitions in the pruned consensus tree, ranging from 50% (majority consensus, our default) to 100% (strict consensus). Alternatively, you can optimize the majority rule extended consensus (MRE) tree or the bipartition support of a tree collection drawn on a maximum likelihood estimate tree.
- Explicitly exclude certain taxa from being considered for pruning as rogue taxa.
- Dropset size: specify the number of taxa simultaneously considered for pruning in each iteration of the algorithm. The default is 1, since this is a particularly expensive operation. However, if you specify the dropset size as "number of taxa" - 1, then the algorithm will find the most informative (i.e., optimal) pruned consensus tree with respect to the given parameters.
- Ready for multi-core machines: expensive phases of the algorithm can be executed in parallel on shared-memory machines. Advantageous for dropset sizes > 1, MRE tree optimization and data sets with more than 2,000 taxa.
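To make the frequency threshold option more concrete, here is a minimal sketch of how a threshold between 50% and 100% selects bipartitions for a consensus tree. It assumes the common convention that a bipartition must occur in strictly more than the threshold fraction of trees, and in all trees for the strict consensus. The function name `consensus_bipartitions` and the example frequencies are hypothetical, and RogueNaRok's own handling may differ in detail.

```python
def consensus_bipartitions(frequencies, threshold=0.5):
    """Keep bipartitions whose frequency passes the consensus threshold."""
    if not 0.5 <= threshold <= 1.0:
        raise ValueError("threshold must lie between 0.5 and 1.0")
    if threshold == 1.0:
        # Strict consensus: the bipartition must occur in every tree.
        return {b for b, f in frequencies.items() if f == 1.0}
    # Majority-style consensus: strictly more than the threshold.
    return {b for b, f in frequencies.items() if f > threshold}

# Hypothetical bipartition frequencies observed in a tree set.
frequencies = {"AB|CDE": 1.0, "ABC|DE": 0.8, "ABD|CE": 0.2}

print(sorted(consensus_bipartitions(frequencies, 0.5)))  # ['AB|CDE', 'ABC|DE']
print(sorted(consensus_bipartitions(frequencies, 0.9)))  # ['AB|CDE']
print(sorted(consensus_bipartitions(frequencies, 1.0)))  # ['AB|CDE']
```

Raising the threshold keeps fewer but better supported bipartitions, so the choice of threshold changes which consensus tree the rogue search optimizes.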
- The RogueNaRok algorithm as implemented here was accepted in Systematic Biology:
- It is an improved version of an algorithm published at IEEE BIBM 2011:
- The rogue taxon identification algorithm implemented in RAxML is published in TCBB:
- RogueNaRok: our algorithm with various variations
- rnr-tii: the taxonomic instability index as known from Mesquite by Maddison.
- rnr-lsi: the leaf stability index (all three measures for UNROOTED trees) by Thorley.
- rnr-prune: a simple program to prune a tree collection and/or a single ML tree
- rnr-mast: computes an unrooted maximum agreement subtree of a tree set.
The name is an allusion to Ragnarok, the twilight/doom of the Norse gods. In mythology, a renewed and fertile world emerges from the catastrophe. This aspect reflects our hope that phylogenies pruned of rogues, as suggested by our algorithm, are more informative than before.
Some functions (and, even more importantly, many implementation concepts) are derived or included from RAxML, a software package for phylogenetic tree inference under Maximum Likelihood by Alexandros Stamatakis.