# Creating a custom summary tree for a set of taxa of interest
## Upload phylogenies 

If you want to use the existing synthesis tree instead of creating a custom tree, see https://github.com/McTavishLab/jupyter_OpenTree_tutorials/blob/master/notebooks/DEMO_OpenTree.ipynb  
Upload the trees you want to summarize to Phylesystem: https://tree.opentreeoflife.org/curator  
Map the tip labels to the OpenTree taxonomy using the OTU Mapping tab. Don't forget to save!  
Add your trees to a collection: https://tree.opentreeoflife.org/curator/collection/  
Rank them based on which tree's relationships you want to prioritize in your summary tree.

For this example I will be summarizing some recent Drosophila trees, which I have placed in my collection 'dros': https://tree.opentreeoflife.org/curator/collection/view/snacktavish/dros  
I have ranked them based on how recently they were published.

I have a list of taxa that I need a tree for. It is stored in drosophila_example/DrosophilaSpecies.txt


## Running this example

Install and system setup info at:
http://opentreeoflife.github.io/SSBworkshop/


```
git clone https://github.com/McTavishLab/jupyter_OpenTree_tutorials.git
cd jupyter_OpenTree_tutorials/workbooks
jupyter notebook
```

The example data for this demo will be in `drosophila_example`.
You should create a working folder for your data and outputs.

## Standardizing taxon names

One of the key challenges of comparing trees across studies is that taxon names differ because of spelling or taxonomic idiosyncrasies.

A solution is to map taxon names to unique identifiers using the Open Tree Taxonomic Name Resolution Service (TNRS). There are a few ways to use this service, including the API and the browser-based bulk name mapping tool.
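
For the API route, a request to the TNRS `match_names` endpoint can be sketched as below. This is a minimal illustration rather than the tutorial's own code: the helper names `build_tnrs_payload` and `match_names` are hypothetical, the endpoint URL and JSON field names are assumed from the Open Tree v3 API documentation, and the network call is defined but not invoked so the block runs offline.

```python
import json
import urllib.request

# Assumed endpoint from the Open Tree v3 API docs; check before relying on it.
TNRS_URL = "https://api.opentreeoflife.org/v3/tnrs/match_names"


def build_tnrs_payload(names, context_name="Insects", approximate=False):
    """Return the JSON-ready request body for a list of taxon names."""
    # Drop empty entries and surrounding whitespace (e.g. from a text file).
    cleaned = [n.strip() for n in names if n.strip()]
    return {
        "names": cleaned,
        "context_name": context_name,
        "do_approximate_matching": approximate,
    }


def match_names(names, url=TNRS_URL):
    """POST the names to TNRS and return the decoded JSON (needs network)."""
    body = json.dumps(build_tnrs_payload(names)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_tnrs_payload(["Drosophila melanogaster", " Drosophila simulans\n", ""])
print(json.dumps(payload, indent=2))
```

Calling `match_names(...)` with your own list would send the request. Restricting the context to 'Insects' mirrors the mapping option used in the browser tool.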

### Open Tree TNRS bulk name mapping tool.

Access this tool at https://tree.opentreeoflife.org/curator/tnrs/

This is a new beta-version of this functionality, so some parts are a bit finicky.

*Try this*
  * Click on "Add names..." (second button at the top of the menu on the left), and upload the names file `drosophila_example/DrosophilaSpecies.txt`. The "loading file" window will not close by itself; click the (X) to dismiss it.
  * In the "Mapping options" section (bottom of the menu to the left):
    - select 'Insects' to narrow down the possibilities and speed up mapping
  * Click "Map selected names" (middle of the menu to the left).
  * Exact matches will show up in green, and can be accepted by clicking "accept exact matches".
  * Once you have accepted names for each of the taxa, click "Save nameset...", download it to your laptop, and extract (unzip) the files. You can take a look at the human-readable version of the output at `output/main.csv`. `main.json` contains the same data in a more computer-readable format.
  * Finally, transfer the `main.csv` file to your working folder, so you can use it to get the tree for your taxa.

*Make sure your mappings were saved! If you do not **accept** matches (by clicking buttons), they do not download.*
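
As a quick check that every name was mapped and saved, the downloaded CSV can be compared against the original list. This is a minimal sketch under stated assumptions: the column layout (input label first, OTT id third) matches the indexing used later in this tutorial, the header names and OTT ids below are placeholders rather than real values, and `find_unmapped` is a hypothetical helper.

```python
import csv
import io

# Hypothetical stand-ins for the species list and the downloaded main.csv.
# The header names and ids are placeholders, not real OTT ids.
species_list = ["Drosophila melanogaster", "Drosophila simulans", "Drosophila yakuba"]
main_csv = io.StringIO(
    "original_label,mapped_name,ott_id\n"
    "Drosophila melanogaster,Drosophila melanogaster,111111\n"
    "Drosophila simulans,Drosophila simulans,222222\n"
)


def find_unmapped(names, csv_file):
    """Return the names that have no row with an OTT id in the mapping CSV."""
    reader = csv.reader(csv_file, delimiter=",", quotechar='"')
    next(reader, None)  # skip the header row
    mapped = {row[0] for row in reader if len(row) > 2 and row[2].strip()}
    return [name for name in names if name not in mapped]


unmapped = find_unmapped(species_list, main_csv)
print("Names without a saved mapping:", unmapped)
```

With real data, pass an open file handle for your `main.csv` instead of the `io.StringIO` object.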


## Get the Most Recent Common Ancestor of your taxa of interest


```
from opentree import OT
import csv

mapped_names = "../drosophila_example/drosophila_main.csv"

# Use the CSV to create a dictionary with OTT ids as keys and the labels you input as values
with open(mapped_names) as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    next(reader, None)  # skip the headers
    label_dict = {row[2]: row[0] for row in reader}

ott_id_list = list(label_dict.keys())

# Get the taxonomic MRCA of the taxa of interest
output = OT.taxon_mrca(ott_ids=ott_id_list)
print(output.response_dict)
```

Output:

```
{'mrca': {'flags': [], 'is_suppressed': False, 'is_suppressed_from_synth': False, 'name': 'Drosophilidae', 'ott_id': 34905, 'rank': 'family', 'source': 'ott3.3draft1', 'synonyms': [], 'tax_sources': ['ncbi:7214', 'worms:987176', 'gbif:5547', 'irmng:100842'], 'unique_name': 'Drosophilidae'}}
```


For my Drosophila example, I will set the root of my custom synth tree to Drosophilidae (`ott_id` 34905).
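
Rather than copying the id by hand, it can be pulled out of the response. A small sketch, assuming the dictionary has the shape printed above (trimmed here to three fields); `root_id_from_mrca` is a hypothetical helper, and the `ott` prefix follows the `root_id` value used in the build request in this tutorial.

```python
# The 'mrca' part of the response printed above, trimmed to the fields used here.
response_dict = {
    "mrca": {"name": "Drosophilidae", "ott_id": 34905, "rank": "family"}
}


def root_id_from_mrca(response):
    """Build the 'ott'-prefixed identifier from an MRCA response dictionary."""
    return "ott{}".format(response["mrca"]["ott_id"])


root_id = root_id_from_mrca(response_dict)
print(response_dict["mrca"]["name"], root_id)
```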

## Run custom synth on your trees

```
!curl -X POST --insecure https://ot38.opentreeoflife.org/v3/tree_of_life/build_tree -d '{"input_collection":"snacktavish/dros", "root_id": "ott34905"}'
```

Output:

```
{"opentree_home": "/home/deploy/synthesis", "ott_dir": "/home/deploy/synthesis/ott/ott3.3", "root_ott_id": "34905", "synth_id": "snacktavish_dros_34905_tmp2_qoc6hj", "collections": "snacktavish/dros", "cleaning_flags": "major_rank_conflict,major_rank_conflict_inherited,environmental,viral,barren,not_otu,hidden,was_container,inconsistent,hybrid,merged", "additional_regrafting_flags": "extinct_inherited,extinct", "queue_order": 19, "status": "QUEUED"}
```
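
The same request body can be assembled in Python, and the reply parsed to keep the `synth_id` for later steps. This is a sketch that assumes the reply is the JSON shown above (shortened here); `build_request_body` is a hypothetical helper, and no request is actually sent.

```python
import json


def build_request_body(collection, root_id):
    """Serialize the two fields sent to build_tree in the curl call above."""
    return json.dumps({"input_collection": collection, "root_id": root_id})


# Response text copied (and shortened) from the output above.
response_text = (
    '{"root_ott_id": "34905", '
    '"synth_id": "snacktavish_dros_34905_tmp2_qoc6hj", '
    '"collections": "snacktavish/dros", "queue_order": 19, "status": "QUEUED"}'
)

body = build_request_body("snacktavish/dros", "ott34905")
reply = json.loads(response_text)
synth_id = reply["synth_id"]
print(body)
print(synth_id, reply["status"])
```

Building the body with `json.dumps` avoids quoting mistakes that are easy to make when typing JSON inside a shell string.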

```
!curl -X GET https://ot38.opentreeoflife.org/v3/tree_of_life/list_custom_built_trees | jq
```

Output:

```
{
  "snacktavish_aves_81461_tmp520utw8e": {
    "opentree_home": "/home/deploy/synthesis",
    "ott_dir": "/home/deploy/synthesis/ott/ott3.3",
    "root_ott_id": "81461",
    "synth_id": "snacktavish_aves_81461_tmp520utw8e",
    "collections": "snacktavish/aves",
    "cleaning_flags": "major_rank_conflict,major_rank_conflict_inherited,environmental,viral,barren,not_otu,hidden,was_container,inconsistent,hybrid,merged",
    "additional_regrafting
```

Find your tree in the list, and download it using GET:

```
curl -X GET https://ot38.opentreeoflife.org/v3/tree_of_life/custom_built_tree/YOUR_SYNTH_ID.tar.gz --output custom_synth.tar.gz
```

I like to rename my tar files to something I can remember, using `--output`.
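
Finding the build and forming the download command can also be scripted. The sketch below assumes the listing is a dictionary keyed by `synth_id`, as in the `list_custom_built_trees` output, and uses only fields visible in this tutorial; the two entries are copied from outputs shown earlier, while `builds_for`, `download_target`, and the nickname `dros_custom_synth` are hypothetical.

```python
BASE_URL = "https://ot38.opentreeoflife.org/v3/tree_of_life/custom_built_tree"

# Entries shaped like the list_custom_built_trees output; only fields that
# appear in the tutorial output are included.
listing = {
    "snacktavish_aves_81461_tmp520utw8e": {
        "root_ott_id": "81461",
        "collections": "snacktavish/aves",
    },
    "snacktavish_dros_34905_tmp2_qoc6hj": {
        "root_ott_id": "34905",
        "collections": "snacktavish/dros",
    },
}


def builds_for(listing, collection, root_ott_id):
    """Return synth ids whose collection and root match the given values."""
    return sorted(
        synth_id
        for synth_id, info in listing.items()
        if info.get("collections") == collection
        and info.get("root_ott_id") == root_ott_id
    )


def download_target(synth_id, nickname):
    """Return (url, local filename) for a custom synthesis archive."""
    url = "{}/{}.tar.gz".format(BASE_URL, synth_id)
    filename = "{}.tar.gz".format(nickname)
    return url, filename


matches = builds_for(listing, "snacktavish/dros", "34905")
url, filename = download_target(matches[0], "dros_custom_synth")
print("curl -X GET {} --output {}".format(url, filename))
```

If a collection has been built more than once, `builds_for` returns several ids and you will need to choose among them.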

```
!wget https://ot38.opentreeoflife.org/v3/tree_of_life/custom_built_tree/snacktavish_dros_34905_tmprbneub2i.tar.gz
```


```
!tar -xzvf snacktavish_dros_34905_tmprbneub2i.tar.gz
```

Output (truncated):

```
snacktavish_dros_34905_tmprbneub2i/
snacktavish_dros_34905_tmprbneub2i/.STATUS.txt
snacktavish_dros_34905_tmprbneub2i/assessments/
snacktavish_dros_34905_tmprbneub2i/assessments/lost_taxa.txt
snacktavish_dros_34905_tmprbneub2i/assessments/taxonomy_degree_distribution.txt
snacktavish_dros_34905_tmprbneub2i/assessments/README.md
snacktavish_dros_34905_tmprbneub2i/assessments/index.json
snacktavish_dros_34905_tmprbneub2i/assessments/supertree_degree_distribution.txt
snacktavish_dros_34905_tmprbneub2i/assessments/summary.json
snacktavish_dros_34905_tmprbneub2i/assessments/index.html
snacktavish_dros_34905_tmprbneub2i/grafted_solution/
snacktavish_dros_34905_tmprbneub2i/grafted_solution/grafted_solution_ottnames.tre
snacktavish_dros_34905_tmprbneub2i/grafted_solution/README.md
snacktavish_dros_34905_tmprbneub2i/grafted_solution/grafted_solution.tre
snacktavish_dros_34905_tmprbneub2i/grafted_solution/index.json
snacktavish_dros_34905_tmprbneub2i/grafted_solution/index.html
sn

snacktavish_dros_34905_tmprbneub2i/.snakemake/log/2022-12-16T094226.522428.snakemake.log
snacktavish_dros_34905_tmprbneub2i/.snakemake/log/2022-12-16T094226.534469.snakemake.log
snacktavish_dros_34905_tmprbneub2i/.snakemake/log/2022-12-16T094226.396331.snakemake.log
snacktavish_dros_34905_tmprbneub2i/.snakemake/log/2022-12-16T094226.516110.snakemake.log
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott63105.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott930774.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott73057.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott682726.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott1035516.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/flag.txt
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott798629.tre
snacktavish_dros_34905_tmprbneub2i/reversed_subproblems/ott387779.tre
snacktavish_dros_349
```
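
The extraction can also be done from Python with the standard-library `tarfile` module, which is handy inside a notebook. This is a sketch only: it builds a tiny stand-in archive (hypothetical names `demo_synth` and `demo.tre`) so that it runs without a download, and `extract_archive` is a hypothetical helper. Only extract archives from sources you trust.

```python
import os
import tarfile
import tempfile


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive and return the sorted member names."""
    with tarfile.open(archive_path, "r:gz") as archive:
        names = sorted(archive.getnames())
        archive.extractall(dest_dir)
    return names


# Build a tiny stand-in archive so the sketch runs without a download.
workdir = tempfile.mkdtemp()
tree_file = os.path.join(workdir, "demo.tre")
with open(tree_file, "w") as handle:
    handle.write("(ott1,ott2);\n")
archive_path = os.path.join(workdir, "demo_synth.tar.gz")
with tarfile.open(archive_path, "w:gz") as archive:
    archive.add(tree_file, arcname="demo_synth/labelled_supertree/demo.tre")

members = extract_archive(archive_path, os.path.join(workdir, "out"))
print(members)
```

For the real archive, point `archive_path` at the downloaded `.tar.gz` file and `dest_dir` at your working folder.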

```
import dendropy

treepath = "snacktavish_dros_34905_tmprbneub2i/labelled_supertree/labelled_supertree.tre"
custom_synth = dendropy.Tree.get_from_path(treepath, schema="newick")
```