
Python library to deal with taxonomic IDs and lineages from the NCBI's Taxdump files


gregdenay/taxidTools

 
 


CD/CI PyPI - License GitHub release (latest by date) Conda Version Pypi Version Docker Image Version DOI

TaxidTools - A Python Toolkit for Taxonomy

taxidTools is a Python library to handle Taxonomy definitions.

Highlights

  • Load taxonomy definitions from the NCBI's taxdump files
  • Prune, filter, and normalize branches
  • Save as JSON for later use
  • Determine consensus, last common ancestor, or distances
  • Retrieve ancestries or list descendants
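
For readers unfamiliar with the taxdump layout, the following is a minimal sketch of how a nodes.dmp-style line maps a taxid to its parent and rank. It is an illustration only, not taxidTools' own loader: the helper name parse_nodes_dmp is hypothetical, and the sample rows are truncated to the first three columns (real nodes.dmp rows carry additional fields).

```python
def parse_nodes_dmp(lines):
    """Map each taxid to (parent taxid, rank) from nodes.dmp-style lines.

    Hypothetical helper for illustration; not part of taxidTools.
    """
    nodes = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Each row ends with "\t|" and fields are separated by "\t|\t".
        if line.endswith("\t|"):
            line = line[:-2]
        fields = line.split("\t|\t")
        taxid, parent, rank = fields[0], fields[1], fields[2]
        nodes[taxid] = (parent, rank)
    return nodes


# Sample rows truncated to three columns (assumption for brevity).
sample = [
    "1\t|\t1\t|\tno rank\t|\n",
    "9605\t|\t207598\t|\tgenus\t|\n",
    "9606\t|\t9605\t|\tspecies\t|\n",
]
nodes = parse_nodes_dmp(sample)
print(nodes["9606"])
```

The root node (taxid 1) lists itself as its own parent, which is how the top of the tree can be detected when walking upwards.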

Installation

With pip:

pip install taxidtools

With conda:

conda install -c conda-forge taxidtools

With docker:

docker pull gregdenay/taxidtools

Quickstart

With the NCBI's taxdump files installed locally:

>>> import taxidTools
>>> tax = taxidTools.read_taxdump('nodes.dmp', 'rankedlineage.dmp', 'merged.dmp')
>>> tax.getName('9606')
'Homo sapiens'
>>> lineage = tax.getAncestry('9606')
>>> lineage.filter()
>>> [node.name for node in lineage]
['Homo sapiens', 'Homo', 'Hominidae', 'Primates', 'Mammalia', 'Chordata', 'Metazoa']
>>> tax.lca(['9606', '10090']).name
'Euarchontoglires'
>>> tax.distance('9606', '10090')
18
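
To make the last two calls more concrete, here is a toy sketch of what a last common ancestor and a node distance mean on a parent-pointer tree. This is not taxidTools' implementation: the functions and the heavily simplified six-node tree are assumptions made for illustration, so the distance differs from the value of 18 obtained on the full NCBI taxonomy above.

```python
def ancestry(parent, node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    # The root is its own parent, as in the NCBI taxdump.
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path


def lca(parent, a, b):
    """Return the first ancestor of b that is also an ancestor of a."""
    ancestors_a = set(ancestry(parent, a))
    for node in ancestry(parent, b):
        if node in ancestors_a:
            return node


def distance(parent, a, b):
    """Count the edges on the path from a to b through their LCA."""
    common = lca(parent, a, b)
    return ancestry(parent, a).index(common) + ancestry(parent, b).index(common)


# Toy tree: child -> parent (most intermediate ranks omitted on purpose).
parent = {
    "root": "root",
    "Euarchontoglires": "root",
    "Primates": "Euarchontoglires",
    "Homo sapiens": "Primates",
    "Rodentia": "Euarchontoglires",
    "Mus musculus": "Rodentia",
}

print(lca(parent, "Homo sapiens", "Mus musculus"))
print(distance(parent, "Homo sapiens", "Mus musculus"))
```

In this toy tree the two species meet at Euarchontoglires, two edges up from each, giving a distance of 4.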

Documentation

Full documentation is hosted on the project homepage.

Cite us

If you use taxidTools for your research, you can cite it using the DOI at the top of this page.
