# Useful tools for taxonomy

## `ete3`
- Python 3 library and command-line tool
- [ETE toolkit](http://etetoolkit.org/) (best installed with Conda)
- [NCBITaxa tutorial](http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html) (very good IMHO)

In [32]:
from ete3 import NCBITaxa
ncbi = NCBITaxa()

In [41]:
lineage = ncbi.get_lineage(9606)  # Lineage taxids for human (taxid 9606), root first

In [43]:
print(lineage)

[1, 131567, 2759, 33154, 33208, 6072, 33213, 33511, 7711, 89593, 7742, 7776, 117570, 117571, 8287, 1338369, 32523, 32524, 40674, 32525, 9347, 1437010, 314146, 9443, 376913, 314293, 9526, 314295, 9604, 207598, 9605, 9606]
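The raw taxids are hard to read on their own. Below is a minimal sketch for labelling them, assuming the `get_taxid_translator` and `get_rank` methods of `NCBITaxa` (each returns a dict keyed by taxid). `describe_lineage` is a hypothetical helper name, not part of ete3.

```python
def describe_lineage(ncbi, taxid):
    """Return (taxid, scientific name, rank) tuples from root down to `taxid`.

    `ncbi` is expected to be an ete3 NCBITaxa instance. This helper is
    hypothetical and only wraps calls that NCBITaxa is assumed to provide.
    """
    lineage = ncbi.get_lineage(taxid)            # ordered list, root first
    names = ncbi.get_taxid_translator(lineage)   # {taxid: scientific name}
    ranks = ncbi.get_rank(lineage)               # {taxid: rank}
    # Fall back to "?" if a taxid is missing from either lookup.
    return [(t, names.get(t, "?"), ranks.get(t, "?")) for t in lineage]


# Usage with the `ncbi` object created above:
# for taxid, name, rank in describe_lineage(ncbi, 9606):
#     print(taxid, name, rank)
```

Passing the `NCBITaxa` object in as an argument keeps the helper independent of how the local taxonomy database was set up.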


In [44]:
tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])  # Get topology for several taxids of interest

In [45]:
print(tree)


   /-7707
  |
  |         /-9606
--|      /-|
  |   /-|   \-9598
  |  |  |
   \-|   \-10090
     |
      \-8782


In [46]:
print(tree.get_ascii(attributes=["sci_name", "rank"]))  # Print more usefully labelled tree


                      /-Dendrochirotida, order
                     |
                     |                                                                /-Homo sapiens, species
-Deuterostomia, no rank                                           /Homininae, subfamily
                     |                /Euarchontoglires, superorder                   \-Pan troglodytes, species
                     |               |                           |
                      \Amniota, no rank                           \-Mus musculus, species
                                     |
                                      \-Aves, class


In [52]:
# Get all descendants of a scientific taxon name of interest
tree = ncbi.get_descendant_taxa('Homo', collapse_subspecies=True, return_tree=True)
print(tree.get_ascii(attributes=['sci_name', 'taxid']))


          /-Homo sapiens, 9606
-Homo, 9605
          \-Homo heidelbergensis, 1425170


In [55]:
# Look up the taxid(s) for a taxon name (returns a dict of name -> list of taxids)
apoda_translated = ncbi.get_name_translator(['Apoda'])
apoda_translated

{'Apoda': [287107]}

In [56]:
# Just the first matching taxid, as an int
apoda_taxid = ncbi.get_name_translator(['Apoda'])['Apoda'][0]
apoda_taxid

287107
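Indexing with `[0]` silently picks the first match and raises an unhelpful error when the name is unknown. Here is a hedged sketch of a stricter lookup, assuming `get_name_translator` omits names it cannot find and may return several taxids for one name. `name_to_taxid` is a hypothetical helper, not an ete3 function.

```python
def name_to_taxid(ncbi, name):
    """Return the single taxid for `name`, or raise if it is missing or ambiguous.

    `ncbi` is expected to be an ete3 NCBITaxa instance (assumption).
    """
    taxids = ncbi.get_name_translator([name]).get(name, [])
    if not taxids:
        raise KeyError(f"No taxid found for {name!r}")
    if len(taxids) > 1:
        raise ValueError(f"{name!r} is ambiguous: {taxids}")
    return int(taxids[0])


# Usage with the `ncbi` object created above:
# apoda_taxid = name_to_taxid(ncbi, 'Apoda')
```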

## EBI taxonomy REST API

[Overview](https://www.ebi.ac.uk/ena/browse/taxonomy-service)


### From scientific name

In [63]:
import requests

r = requests.get('https://www.ebi.ac.uk/ena/data/taxonomy/v1/taxon/scientific-name/Cryptobranchoidea')
r.json()

[{'division': 'VRT',
  'formalName': 'false',
  'geneticCode': '1',
  'lineage': 'Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Amphibia; Batrachia; Caudata; ',
  'mitochondrialGeneticCode': '2',
  'rank': 'superfamily',
  'scientificName': 'Cryptobranchoidea',
  'submittable': 'false',
  'taxId': '30364'}]
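Every value in the response is a string, including `taxId`, and `lineage` is one `; `-separated string with a trailing separator. A small sketch for turning a record into friendlier Python types follows; `parse_taxon` is a hypothetical helper, and the field names are taken from the output above.

```python
def parse_taxon(record):
    """Convert one taxon record (a dict of strings) into typed values."""
    # Split the lineage on ';' and drop the empty piece left by the trailing separator.
    lineage = [part.strip() for part in record["lineage"].split(";") if part.strip()]
    return {
        "taxid": int(record["taxId"]),
        "name": record["scientificName"],
        "rank": record.get("rank"),
        "lineage": lineage,
    }


# Usage with the response fetched above:
# taxa = [parse_taxon(rec) for rec in r.json()]
```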

### From common name (via the `any-name` endpoint)

In [65]:
requests.get('https://www.ebi.ac.uk/ena/data/taxonomy/v1/taxon/any-name/frogs').json()

[{'commonName': 'frogs and toads',
  'division': 'VRT',
  'formalName': 'false',
  'geneticCode': '1',
  'lineage': 'Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Amphibia; Batrachia; ',
  'mitochondrialGeneticCode': '2',
  'rank': 'order',
  'scientificName': 'Anura',
  'submittable': 'false',
  'taxId': '8342'}]
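Names containing spaces (for example `Homo sapiens`) need to be percent-encoded before they go into the URL path. Here is a minimal sketch using the standard library. `taxon_url` is a hypothetical helper, and the base URL is copied from the examples above.

```python
from urllib.parse import quote

# Base URL as used in the examples above.
BASE_URL = "https://www.ebi.ac.uk/ena/data/taxonomy/v1/taxon"


def taxon_url(name, endpoint="scientific-name"):
    """Build a lookup URL; `endpoint` is 'scientific-name' or 'any-name'."""
    # safe='' also encodes '/', so a name cannot add extra path segments.
    return f"{BASE_URL}/{endpoint}/{quote(name, safe='')}"


# Usage (performs a network request, so it is left commented out):
# resp = requests.get(taxon_url("Homo sapiens"), timeout=10)
# resp.raise_for_status()  # fail loudly on HTTP errors
# resp.json()
```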