phylogemetric

A Python library for calculating the delta score (Holland et al. 2002) and Q-Residual (Gray et al. 2010) for phylogenetic data.


Installation:

Installation is only a pip install away:

pip install phylogemetric

Usage: Command line

Basic usage:

> phylogemetric

usage: phylogemetric [-h] method filename

Calculate the delta score for the file example.nex:

> phylogemetric delta example.nex

taxon1              0.2453
taxon2              0.2404
taxon3              0.2954
...

Calculate the Q-Residual score for the file example.nex:

> phylogemetric qresidual example.nex

taxon1              0.0030
taxon2              0.0037
taxon3              0.0063
...

Note: to save the results to a file, use shell redirection, e.g.:

> phylogemetric qresidual example.nex > qresidual.txt
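
Once the scores are saved, they can be read back for further analysis. The following is a minimal sketch, not part of phylogemetric: `parse_scores` is a hypothetical helper, and it assumes the output keeps the two-column "taxon score" layout shown above, with no whitespace inside taxon names.

```python
def parse_scores(lines):
    """Map taxon name -> score from whitespace-separated 'taxon score' lines."""
    scores = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines or lines that are not 'taxon score'
        taxon, value = parts
        try:
            scores[taxon] = float(value)
        except ValueError:
            continue  # skip lines whose second column is not a number
    return scores


# Sample lines in the layout shown above (assumed format):
sample = ["taxon1              0.0030", "taxon2              0.0037", ""]
print(parse_scores(sample))
```

To read a saved file, pass the open file handle instead of the sample list, e.g. `parse_scores(open("qresidual.txt"))`.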

Usage: Library

Calculate scores:

from nexus import NexusReader
from phylogemetric import DeltaScoreMetric
from phylogemetric import QResidualMetric

# load data from a nexus file:
nex = NexusReader("filename.nex")
qres = QResidualMetric(nex.data.matrix)

# Or construct a data matrix directly: 

matrix = {
    'A': [
        '1', '1', '1', '1', '0', '0', '1', '1', '1', '0', '1', '1',
        '1', '1', '0', '0', '1', '1', '1', '0'
    ],
    'B': [
        '1', '1', '1', '1', '0', '0', '0', '1', '1', '1', '1', '1',
        '1', '1', '1', '0', '0', '1', '1', '1'
    ],
    'C': [
        '1', '1', '1', '1', '1', '1', '1', '0', '1', '1', '1', '0',
        '0', '0', '0', '1', '0', '1', '1', '1'
    ],
    'D': [
        '1', '0', '0', '0', '0', '1', '0', '1', '1', '1', '1', '0',
        '0', '0', '0', '1', '0', '1', '1', '1'
    ],
    'E': [
        '1', '0', '0', '0', '0', '1', '0', '1', '0', '1', '1', '0',
        '0', '0', '0', '1', '1', '1', '1', '1'
    ],
}

delta = DeltaScoreMetric(matrix)

Methods:

m = DeltaScoreMetric(matrix)

# calculates the number of quartets in the data:
m.nquartets()

# returns the distance between two sequences:
m.dist(['1', '1', '0'], ['0', '1', '0'])

# gets a dictionary of metric scores:
m.score()

# pretty prints the metric scores:
m.pprint()
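
For intuition about what a quartet is here: both metrics are built from sets of four taxa. The sketch below enumerates the quartets for the five taxa in the example matrix using only the standard library. That `m.nquartets()` returns exactly this count is an assumption, not something this README states.

```python
from itertools import combinations
from math import comb

# Taxon labels from the example matrix above.
taxa = ["A", "B", "C", "D", "E"]

# Every unordered set of four taxa is one quartet.
quartets = list(combinations(taxa, 4))

# The number of quartets is the binomial coefficient C(n, 4).
assert len(quartets) == comb(len(taxa), 4)

for quartet in quartets:
    print(quartet)
```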

Requirements:

  • python-nexus >= 1.1

Performance Notes:

Currently phylogemetric is implemented in Python, and the delta score and Q-Residual algorithms are O(n^4) in the number of taxa, because every quartet of taxa is evaluated. This means that performance is not optimal, and it may take a while to calculate these metrics for datasets with more than about 100 taxa.

I hope to improve performance in the near future, but in the meantime, if this is an issue for you then try using the implementations available in SplitsTree.
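
The cost comes from visiting every quartet of taxa. The following is a minimal sketch of the delta score as defined by Holland et al. (2002), included only to show where the work goes. It is not the library's implementation: the function names are hypothetical, and the distance is assumed to be a plain Hamming proportion with no special handling of missing data.

```python
from itertools import combinations
from math import comb


def hamming(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def quartet_delta(d, x, y, u, v):
    """Delta for one quartet: (m1 - m2) / (m1 - m3) over the three pairings."""
    m1, m2, m3 = sorted(
        [d[x, y] + d[u, v], d[x, u] + d[y, v], d[x, v] + d[y, u]],
        reverse=True,
    )
    # A quartet with three equal sums carries no signal; score it as 0.
    return 0.0 if m1 == m3 else (m1 - m2) / (m1 - m3)


def delta_scores(matrix):
    """Mean quartet delta per taxon, visiting all C(n, 4) quartets."""
    taxa = sorted(matrix)
    d = {}
    for a, b in combinations(taxa, 2):
        d[a, b] = d[b, a] = hamming(matrix[a], matrix[b])
    totals = {t: 0.0 for t in taxa}
    counts = {t: 0 for t in taxa}
    for quartet in combinations(taxa, 4):
        delta = quartet_delta(d, *quartet)
        for t in quartet:
            totals[t] += delta
            counts[t] += 1
    return {t: totals[t] / counts[t] for t in taxa if counts[t]}


# The number of quartets grows with the fourth power of the number of taxa.
for n in (10, 50, 100):
    print(n, comb(n, 4))
```

With 100 taxa the inner loop already runs several million times, which is why larger datasets are slow in an interpreted implementation.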

Citation:

If you use phylogemetric, please cite:

Greenhill, SJ. 2016. Phylogemetric: A Python library for calculating phylogenetic network metrics. Journal of Open Source Software.
http://dx.doi.org/10.21105/joss.00028

Acknowledgements:

  • Thanks to David Bryant for clarifying the Q-Residual code.
  • Thanks to Kristian Rother for code quality suggestions.