This is clearcut, the reference implementation for the Relaxed Neighbor Joining algorithm.
$Id: README,v 1.3 2006/09/01 04:55:39 sheneman Exp $

******************************************************************************
Clearcut :: Relaxed Neighbor Joining  (Version 1.0.9, Feb. 2009)
******************************************************************************

INTRODUCTION:
-------------

Clearcut is the reference implementation for the Relaxed Neighbor Joining
(RNJ) algorithm by J. Evans, L. Sheneman, and J. Foster from the Initiative
for Bioinformatics and Evolutionary Studies (IBEST) at the University of
Idaho.

Details of RNJ are published here:

  Evans, J., L. Sheneman, and J.A. Foster (2006) Relaxed Neighbor-Joining:
  A Fast Distance-Based Phylogenetic Tree Construction Method,
  J. Mol. Evol., 62, 785-792

Relaxed Neighbor-Joining (RNJ) is a fast approximation to the Neighbor
Joining algorithm originally described in:

  Saitou, N. and M. Nei (1987), The Neighbor-Joining method: A new method
  for reconstructing phylogenetic trees., Mol. Biol. Evol. 4:406-425

and revised in:

  Studier, J. and K. Keppler (1988), A note on the neighbor-joining
  algorithm of Saitou and Nei., Mol. Biol. Evol. 5:729-731

Whereas traditional Neighbor-Joining has a cubic time complexity with
respect to the number of input sequences, Relaxed Neighbor-Joining has a
drastically reduced, sub-cubic time complexity for the average case.

In addition to being significantly (and asymptotically) more efficient,
Relaxed Neighbor-Joining shares some nice theoretical properties with
Traditional Neighbor-Joining. In particular, if distances are truly
additive (self-consistent), RNJ will reconstruct the true tree that is
consistent with those additive distances. For non-additive distances, RNJ
will sample the space of similar trees more thoroughly than traditional NJ
by greedily joining nodes which represent a locally (as opposed to globally)
minimum distance between nodes. This results in sampling more trees than it
is possible to explore through NJ alone.
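The difference between a locally and a globally minimum join can be sketched in a few lines of C. This is an illustration only, not code taken from Clearcut: the function names (transformed, nearest, can_join_relaxed) are hypothetical, the matrix is assumed to be a full, symmetric, row-major n x n array, and the relaxed criterion is assumed to be "each node is the other's nearest neighbor under the usual Neighbor-Joining transformed distance".

```c
#include <assert.h>
#include <stddef.h>

/* Sum of the distances from node i to every node in an n x n matrix. */
static double row_sum(const double *d, size_t n, size_t i)
{
    double s = 0.0;
    for (size_t k = 0; k < n; k++)
        s += d[i * n + k];
    return s;
}

/* Neighbor-Joining transformed distance:
 *   t(i,j) = d(i,j) - (r_i + r_j) / (n - 2),  r_i = sum over k of d(i,k). */
double transformed(const double *d, size_t n, size_t i, size_t j)
{
    assert(n >= 3 && i != j);
    return d[i * n + j]
         - (row_sum(d, n, i) + row_sum(d, n, j)) / (double)(n - 2);
}

/* Index j != i with the smallest transformed distance to i
 * (the first such index wins on ties). */
size_t nearest(const double *d, size_t n, size_t i)
{
    size_t best = (i == 0) ? 1 : 0;
    for (size_t j = 0; j < n; j++) {
        if (j == i)
            continue;
        if (transformed(d, n, i, j) < transformed(d, n, i, best))
            best = j;
    }
    return best;
}

/* Relaxed criterion (assumed): i and j may be joined as soon as each is
 * the other's nearest neighbor. No scan for the global minimum over all
 * pairs is needed. */
int can_join_relaxed(const double *d, size_t n, size_t i, size_t j)
{
    return nearest(d, n, i) == j && nearest(d, n, j) == i;
}
```

For the additive four-taxon matrix with rows {0,3,9,10}, {3,0,10,11}, {9,10,0,7}, {10,11,7,0}, the pairs (0,1) and (2,3) both satisfy can_join_relaxed, while (0,2) does not. Either qualifying pair is a valid first join, which is why the order of joins can vary between runs.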

Additionally, it has been empirically shown that for non-additive
distances, RNJ is capable of reconstructing trees which are qualitatively
almost indistinguishable from trees built via the conventional
Neighbor-Joining algorithm.

******************************************************************************

INSTALLING CLEARCUT:
--------------------

Clearcut was developed and tested primarily under Red Hat Linux 7.2 and
Red Hat Enterprise Linux 3 on the Pentium 3 and Pentium 4 architectures.
However, Clearcut should compile and run easily on other UNIX and UNIX-like
operating systems. For example, Clearcut builds cleanly on the Sun Solaris 9
and Apple OS X platforms.

Good compiler optimization can result in an approximately 2X overall
speedup for Clearcut. Compiler optimizations for GCC under Linux on the
Pentium 4 architecture have been thoroughly explored. The Makefile included
in this distribution has several possible sets of compiler flags available.
Uncomment the appropriate set of compiler flags for your particular
architecture and compiler combination. The default CFLAGS is set to basic
level 3 optimization with gcc, but significant additional compiler
optimizations are most likely available/desirable.

To build Clearcut:

  + Unzip and extract the distribution.
  + Edit "Makefile" and select the appropriate optimization flags.
  + Type "make" to compile and link clearcut.
  + Type "make install" to install clearcut on your system.

******************************************************************************

RUNNING CLEARCUT:
-----------------

Clearcut has a variety of possible command-line arguments. To see the
available arguments, type:

  $> clearcut --help

Clearcut can read either a pre-computed distance matrix in approximate
PHYLIP format or an alignment in FASTA format. In addition to fully
symmetric distance matrices, Clearcut can parse upper and lower diagonal
half-matrices.
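The relationship between a half-matrix and the full symmetric matrix it stands for can be sketched as follows. This is not Clearcut's parser: the function name expand_lower is hypothetical, and the packed row-by-row layout of the lower half (without the diagonal) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Expand a packed lower half-matrix into a full symmetric n x n matrix.
 *
 * Assumed input layout (diagonal omitted, one row after another):
 *   row 1: d(1,0)
 *   row 2: d(2,0) d(2,1)
 *   row 3: d(3,0) d(3,1) d(3,2)
 *   ...
 * so "lower" holds n * (n - 1) / 2 values. "full" must have room for
 * n * n values and is written in row-major order. */
void expand_lower(const double *lower, size_t n, double *full)
{
    assert(lower != NULL && full != NULL);

    size_t k = 0;                        /* position in the packed input */
    for (size_t i = 0; i < n; i++) {
        full[i * n + i] = 0.0;           /* distance from a node to itself */
        for (size_t j = 0; j < i; j++) {
            full[i * n + j] = lower[k];  /* entry below the diagonal */
            full[j * n + i] = lower[k];  /* mirrored entry above it  */
            k++;
        }
    }
}
```

An upper half-matrix carries the same information and can be expanded the same way with the roles of i and j exchanged.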

Clearcut can build a distance matrix from a Multiple Sequence Alignment
(MSA) of either DNA or Protein sequences. When building a distance matrix,
percent pairwise differences can be corrected for multiple hits using
either the Jukes-Cantor or Kimura correction methods as described in:

  Kimura, M. (1980), A simple method for estimating evolutionary rates of
  base substitutions through comparative studies of nucleotide sequences.
  J. Mol. Evol., 16, 111-120

  Kimura, M. (1983), The Neutral Theory of Molecular Evolution. p. 75.,
  Cambridge University Press, Cambridge, England

  Jukes, T.H. and C.R. Cantor (1969), Evolution of protein molecules.
  In H.N. Munro (Ed.), Mammalian Protein Metabolism, Volume III,
  Chapter 24, pp. 21-132. New York: Academic Press

By default, Clearcut will perform joins in a random fashion in order to
minimize systematic algorithmic bias. This behavior can be disabled using
the --norandom runtime option.

By default, Clearcut uses Relaxed Neighbor-Joining, although Traditional
Neighbor-Joining can be invoked with the --neighbor runtime option.

******************************************************************************

AUTHOR/MAINTAINER:
------------------

Clearcut is maintained by:

  Luke Sheneman
  sheneman@hungry.com
  https://github.com/sheneman/clearcut

Please contact the maintainer with questions, bug reports, and feedback!

******************************************************************************

LICENSE:
--------

Clearcut is distributed under the BSD license, which is described in the
LICENSE file that is bundled with this program.

******************************************************************************