# How do you get a tree for a list of species names?
## Standardizing taxon names

One of the key challenges of comparing trees across studies is that taxon names differ because of spelling variants or taxonomic idiosyncrasies.

One solution is to map taxon names to unique identifiers using the Open Tree Taxonomic Name Resolution Service (TNRS). There are a few ways to use this service, including the API and the browser-based bulk name mapping tool. The taxon names you will search for in this tutorial were copied from https://en.wikipedia.org/wiki/List_of_birds_of_Georgia_(U.S._state) and saved in the file `tutorial/GA_waterfowl.txt`.
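
For the API route, the following is a minimal sketch of what a request body and a response look like. It assumes the v3 `tnrs/match_names` endpoint and its field names (`names`, `context_name`, `do_approximate_matching`, `results`, `matches`, `taxon`, `ott_id`). The helper names `build_tnrs_request` and `first_match_ott_ids` are hypothetical, and the example response is hand-built; nothing is sent over the network here.

```python
import json

# Assumed v3 endpoint; shown for reference only, no request is made in this sketch.
TNRS_URL = "https://api.opentreeoflife.org/v3/tnrs/match_names"


def build_tnrs_request(names, context_name="Birds", approximate=False):
    """Return the JSON body for a match_names request (hypothetical helper)."""
    return json.dumps({
        "names": list(names),
        "context_name": context_name,
        "do_approximate_matching": approximate,
    })


def first_match_ott_ids(response):
    """Map each queried name to the OTT id of its first match, or None."""
    mapping = {}
    for result in response.get("results", []):
        matches = result.get("matches", [])
        mapping[result["name"]] = matches[0]["taxon"]["ott_id"] if matches else None
    return mapping


# Hand-built response in the assumed shape, using an id from this tutorial's main.csv.
example_response = {
    "results": [
        {"name": "Branta canadensis",
         "matches": [{"matched_name": "Branta canadensis",
                      "taxon": {"ott_id": 714461, "name": "Branta canadensis"}}]},
        {"name": "Not a bird", "matches": []},
    ]
}
tnrs_ids = first_match_ott_ids(example_response)
```

Taking only the first match is a simplification; the bulk upload tool described below lets you review each candidate before accepting it.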

### Open Tree TNRS bulk upload tool

Access this tool at https://tree.opentreeoflife.org/curator/tnrs/

This is a brand new beta-version of this functionality, so some parts are a bit finicky.

*Try this*
  * Click on "Add names..." (second button at the top of the menu on the left), and upload the names file `tutorial/GA_waterfowl.txt`. The "loading file" window will not close by itself; click the (X) to dismiss it.
  * In the "Mapping options" section (bottom of the menu to the left):
    - select 'Birds' to narrow down the possibilities and speed up mapping
  * Click "Map selected names" (middle of the menu to the left).
  * Exact matches will show up in green, and can be accepted by clicking "accept exact matches".
  * Once you have accepted names for each of the taxa, click "Save nameset...", download it to your laptop, and extract (unzip) the files. You can take a look at the human-readable version of the output at `output/main.csv`. `main.json` contains the same data in a more computer-readable format.
  * Finally, transfer the `main.csv` file to the tutorial folder, so you can use it to get the tree for your taxa.

*Make sure your mappings were saved! If you do not **accept** matches (by clicking buttons), they do not download.*
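
One way to check that nothing was lost is to compare your original names against the downloaded file. The sketch below assumes the column headers shown in `main.csv` (`ORIGINAL LABEL`, `OTT TAXON ID`); the function name `unmapped_names` and the sample data are illustrative only.

```python
import csv
import io


def unmapped_names(names, csv_text):
    """Return the names that have no row with a numeric OTT id in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    mapped = {
        row["ORIGINAL LABEL"].strip()
        for row in reader
        if (row.get("OTT TAXON ID") or "").strip().isdigit()
    }
    return [name for name in names if name.strip() not in mapped]


# Small sample in the same layout as main.csv (sources column shortened).
sample_csv = """ORIGINAL LABEL,OTT TAXON NAME,OTT TAXON ID,TAXONOMIC SOURCES
Branta canadensis,Branta canadensis,714461,ncbi:8853
Cygnus columbianus (A),Cygnus columbianus,207360,ncbi:110926
"""
submitted = ["Branta canadensis", "Cygnus columbianus (A)", "Anas acuta"]
missing = unmapped_names(submitted, sample_csv)
```

With your own files, you would read the names from `GA_waterfowl.txt` and the CSV text from `main.csv`; any name in `missing` still needs to be mapped and accepted in the tool.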


Detailed mapping demo notes:

1. Map and accept exact matches
2. Click through a few synonyms. We actually treat the older name *Anas clypeata* as the 'canonical' name, based on NCBI taxonomy, but recognize the synonym *Spatula clypeata* (2009), a newer name introduced because *Anas* is not monophyletic. Other synonyms reflect shifts to match Latin grammar, as in *Nomonyx dominicus*. Accept synonyms.
3. *Anas crecca carolinensis* is synonymous with *Anas carolinensis*: https://en.wikipedia.org/wiki/Green-winged_teal
4. You should be left with 7 spp. with parentheticals after them. Use "Mapping options" to "Remove last of multiple words", and accept exact matches.
5. TNRS suggests *Branta hutchinsii* for *Branta hutchinsonii*. That seems a bit far! Go back to the original source to decide whether to accept the match: https://en.wikipedia.org/wiki/List_of_birds_of_Georgia_(U.S._state), https://en.wikipedia.org/wiki/Cackling_goose
6. Use the edit function to fill in *Anas crecca crecca* for *A. c. crecca* and map.
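
The clean-up in steps 4 and 5 can also be approximated before upload. This is a rough sketch, not what the curator tool does internally: `strip_trailing_parenthetical` and `similarity` are hypothetical helpers, and the `difflib` ratio is only a stand-in for judging how far apart two spellings are, not the score TNRS reports.

```python
import re
from difflib import SequenceMatcher


def strip_trailing_parenthetical(label):
    """Drop a trailing '(...)' annotation such as '(A)' from a name."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", label).strip()


def similarity(a, b):
    """Case-insensitive difflib ratio between 0 and 1 (not the TNRS score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


cleaned = strip_trailing_parenthetical("Branta hutchinsonii (A)")
score = similarity(cleaned, "Branta hutchinsii")
```

A high ratio only says the spellings are close. Whether the two names refer to the same taxon still has to be checked against the original source, as in step 5.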


In [2]:
!head ../tutorial/main.csv

ORIGINAL LABEL,OTT TAXON NAME,OTT TAXON ID,TAXONOMIC SOURCES
Dendrocygna autumnalis,Dendrocygna autumnalis,662608,ncbi:8873;worms:422567;gbif:2498393;irmng:10189259
Dendrocygna bicolor,Dendrocygna bicolor,662618,ncbi:8874;worms:212674;gbif:2498402;irmng:10189260
Anser caerulescens,Anser caerulescens,190878,ncbi:8849;worms:159204;gbif:2498165;irmng:11167814;irmng:10195594
Anser rossii,Anser rossii,767830,ncbi:56281;worms:159086;irmng:10824526;irmng:10189256;irmng:10189258
Anser albifrons,Anser albifrons,430239,ncbi:50365;worms:159159;gbif:2498017;irmng:10587752
Branta bernicla (A),Branta bernicla,135287,ncbi:184712;worms:159175;gbif:5232446;irmng:11345921
Branta hutchinsonii (A),Branta hutchinsii,972531,ncbi:371860;worms:422569;gbif:5232452;irmng:11831705
Branta canadensis,Branta canadensis,714461,ncbi:8853;worms:159176;gbif:5232437;irmng:10585055
Cygnus columbianus (A),Cygnus columbianus,207360,ncbi:110926;worms:159088;gbif:2498338;irmng:11267548;irmng:10997958


In [3]:
from opentree import OT

import csv

ott_ids = set()

with open("../tutorial/main.csv", newline="") as fi:
    reader = csv.reader(fi)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) > 2 and row[2].strip():  # ignore blank or unmapped rows
            ott_ids.add(int(row[2]))  # third column is the OTT taxon id


treefile = "GA_waterfowl.tre"
#Get the synthetic tree from OpenTree
output = OT.synth_induced_tree(ott_ids=list(ott_ids),  label_format='name')
output.tree.write(path = treefile, schema = "newick")
output.tree.print_plot(width=100)


                                                                      /--- Aythya americana         
                                                                  /---+                             
                                                              /---+   \--- Aythya collaris          
                                                              |   |                                 
                      /----------+---+--+---+---+--+---+---+--+   \---+--- Aythya valisineria       
                      |                                       |                                     
                      |                                       |       /--- Aythya affinis           
                      |                                       \---+---+                             
                      |                                               \--- Aythya marila            
                      |                                                                    

In [4]:
output.tree.as_string(schema="newick")


'((((((((((((((((((Aythya_americana,Aythya_collaris)mrcaott693332ott1044563,(Aythya_valisineria)mrcaott817989ott832938)mrcaott693332ott817989,((Aythya_affinis,Aythya_marila)mrcaott1044556ott1044561)mrcaott817992ott1044556)mrcaott693332ott817992)Aythya)mrcaott693332ott857019)mrcaott693332ott693335)mrcaott30843ott693332)mrcaott30843ott423364)mrcaott30843ott552052)mrcaott30843ott140301)mrcaott30843ott88385,((((((((Anas_discors)mrcaott82415ott911468)mrcaott82415ott206533,((Anas_clypeata)mrcaott206534ott656794)mrcaott206534ott604189)mrcaott82415ott206534)mrcaott82415ott604173)mrcaott82415ott82424)mrcaott82415ott1086566)mrcaott30845ott82415,((((((((((((Anas_rubripes)mrcaott82410ott604178,Anas_fulvigula)mrcaott82410ott82422,Anas_platyrhynchos)mrcaott82410ott190881)mrcaott82410ott604182)mrcaott82410ott604175)mrcaott82410ott339494)mrcaott30850ott82410)mrcaott30850ott604172)mrcaott30850ott30855)mrcaott30850ott30858,(((((Anas_acuta)mrcaott855474ott911477)mrcaott82414ott855474,Anas_bahamensis)mrca
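
To see which taxa made it into the tree without reading the raw string, the tip labels can be pulled out with a regular expression. This is a minimal sketch that assumes unquoted labels, as in the output above; `newick_tip_labels` is a hypothetical helper, and a proper Newick parser is the safer choice for quoted labels or branch-length annotations.

```python
import re


def newick_tip_labels(newick):
    """Return tip labels, i.e. labels that directly follow '(' or ','.

    Internal node labels follow ')' and are therefore skipped.
    """
    found = re.findall(r"(?<=[(,])[^(),:;]+", newick)
    return [label.strip().replace("_", " ") for label in found if label.strip()]


# A small fragment in the same style as the synthetic tree string above.
small_newick = (
    "((Aythya_americana,Aythya_collaris)mrcaott693332ott1044563,"
    "(Aythya_valisineria)mrcaott817989ott832938)mrcaott693332ott817989;"
)
tips = newick_tip_labels(small_newick)
```

Comparing `tips` with the `OTT TAXON NAME` column of `main.csv` is a quick way to spot taxa that were mapped but are absent from the induced tree.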

In [3]:
# We can also get the citations for the studies used to build this tree
studies = output.response_dict['supporting_studies']
cites = OT.get_citations(studies)
print(cites)

https://tree.opentreeoflife.org/curator/study/view/ot_521?tab=trees&tree=tree1
Burleigh, J. Gordon, Rebecca T. Kimball, Edward L. Braun. 2015. Building the avian tree of life using a large-scale, sparse supermatrix. Molecular Phylogenetics and Evolution 84: 53-63
http://dx.doi.org/10.1016/j.ympev.2014.12.003

https://tree.opentreeoflife.org/curator/study/view/ot_809?tab=trees&tree=tree2
Jetz, W., G. H. Thomas, J. B. Joy, K. Hartmann, A. O. Mooers. 2012. The global diversity of birds in space and time. Nature 491 (7424): 444-448
http://dx.doi.org/10.1038/nature11631

https://tree.opentreeoflife.org/curator/study/view/ot_531?tab=trees&tree=tree1
Claramunt, Santiago, Joel Cracraft. 2015. A new time tree reveals Earth historys imprint on the evolution of modern birds. Science Advances 1 (11): e1501005-e1501005
http://dx.doi.org/10.1126/sciadv.1501005

https://tree.opentreeoflife.org/curator/study/view/pg_420?tab=trees&tree=tree522
Hackett, S. J., R. T. Kimball, S. Reddy, R. C. K. Bowie, E.

# What studies in Open Tree contain amphibians?

In [9]:
ott_id = OT.get_ottid_from_name("amphibia") # Get the OttID for amphibia

trees = OT.find_trees(ott_id, search_property = 'ot:ottId') #Search through trees for amphibians

amph_studies = set()
for match in trees.response_dict['matched_studies']:
    amph_studies.add(match['ot:studyId'])

print(OT.get_citations(amph_studies))

https://tree.opentreeoflife.org/curator/study/view/pg_438
Goloboff, Pablo A., Santiago A. Catalano, J. Marcos Mirande, Claudia A. Szumik,
J. Salvador Arias, Mari Kallersjo, and James S. Farris. 2009. Phylogenetic analysis of 73 060 taxa corroborates major eukaryotic groups. Cladistics 25 (3): 211–230.
http://dx.doi.org/10.1111/j.1096-0031.2009.00255.x

https://tree.opentreeoflife.org/curator/study/view/pg_2506
Boussau B., Brown J.M., & Fujita M.K. 2011. Nonadaptive Evolution of Mitochondrial Genome Size. Evolution, .
http://dx.doi.org/10.1111/j.1558-5646.2011.01322.x

https://tree.opentreeoflife.org/curator/study/view/pg_1428
Meredith, R.W., Janecka J., Gatesy J., Ryder O.A., Fisher C., Teeling E., Goodbla A., Eizirik E., Simao T., Stadler T., Rabosky D., Honeycutt R., Flynn J., Ingram C., Steiner C., Williams T., Robinson T., Herrick A., Westerman M., Ayoub N., Springer M., & Murphy W. 2011. Impacts of the Cretaceous Terrestrial Revolution and KPg Extinction on Mammal Diversification

# What publications are being used in the synthesis tree for amphibians?

In [1]:
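from opentree import OT  # this cell ran in a fresh kernel, so OT must be imported again

ott_id = OT.get_ottid_from_name("amphibia")
# Assumption: synth_subtree reports 'supporting_studies' the same way synth_induced_tree does
output = OT.synth_subtree(ott_id=ott_id, label_format='name')
print(OT.get_citations(output.response_dict['supporting_studies']))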

# DIY! Find a list of species somewhere and map the names to OTT ids