## Notebook to ensure tip order is correct

#### Concerns: 
    
1. Are the names being swapped correctly when changing to msprime naming (1-ntips) and then back to the names on the species tree (toytree)?
2. Is the sequence array being ordered correctly given the names applied to the trees? 
3. For balanced splits (cherries), does the arbitrary ordering of tips in the tree have any consequence?
   
    
#### How this is demonstrated in this notebook:  
1. We show that the relationships in the species tree match those in the gene trees. Setting different Ne values on different branches makes this clearer, because coalescent times then differ visibly between branches.
2. We compare the inferred gene tree for a locus with the true genealogy. The same variation in Ne among branches makes any mismatch in tip names or coalescent times easy to see.
3. Again, setting different Ne values on different branches makes it clear whether the topology and coalescent times are being used correctly. We test this on both balanced and imbalanced species trees.
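The first two concerns can also be checked without any plotting. The block below is a minimal sketch in plain Python, not ipcoal's actual implementation: the label format (`r0-0`, `r0-1`, ...) and the variables `to_int`, `to_name`, and `seqs_by_id` are assumptions made for illustration. It maps tree labels to integer IDs in 1..ntips and back, then confirms that reordering a stand-in sequence array by label keeps each row attached to the right name.

```python
# Hypothetical species-tree tip names and sample count (illustrative only).
species = ["r0", "r1", "r2", "r3"]
nsamples = 3

# Sample labels in the order they would be handed to the simulator,
# e.g. "r0-0", "r0-1", "r0-2", "r1-0", ...
labels = [f"{sp}-{i}" for sp in species for i in range(nsamples)]

# Forward map: tree label -> integer ID in 1..ntips (msprime-style name).
to_int = {name: idx for idx, name in enumerate(labels, start=1)}
# Reverse map: integer ID -> tree label.
to_name = {idx: name for name, idx in to_int.items()}

# The round trip must return every label unchanged.
roundtrip = [to_name[to_int[name]] for name in labels]
assert roundtrip == labels

# Stand-in for the sequence array: one row per integer ID, ordered 1..ntips.
# Each row is filled with its own ID so that a misplaced row is detectable.
seqs_by_id = [[idx] * 5 for idx in range(1, len(labels) + 1)]

# Reorder the rows to follow an arbitrary tip order (here: reverse sorted).
tip_order = sorted(labels, reverse=True)
reordered = [seqs_by_id[to_int[name] - 1] for name in tip_order]

# Every row must still carry the ID of the label it is now filed under.
rows_ok = all(row[0] == to_int[name] for name, row in zip(tip_order, reordered))
print(rows_ok)
```

The same pattern applies to a real alignment: index the rows through the name map rather than relying on two lists happening to share an order.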

In [2]:
import ipcoal
import toytree
import toyplot
import numpy as np

### 1. Name ordering

We can see clearly in this example that the name ordering is working correctly. Here I sample three individuals from each population in a tree with 8 populations. I then set Ne to be large for one member of each pair of sister species and small for the other. In the plots, the tips assigned large Ne (the odd-numbered tips, except that the imbalanced example uses tip 0 rather than tip 1) coalesce deep in the tree, while the remaining tips have small Ne and thus short coalescent times. The relationships and coalescent times appear as expected.
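To put rough numbers on "large" and "small", the sketch below computes the expected time for three samples from one population to reach their common ancestor. It assumes a standard diploid Kingman coalescent with time measured in generations; the helper `expected_tmrca` is a hypothetical name and is not part of ipcoal or msprime.

```python
def expected_tmrca(ne, k):
    """Expected generations until k lineages coalesce in one diploid population.

    Assumes the standard Kingman coalescent: with j lineages remaining, the
    waiting time to the next coalescence has mean 4 * Ne / (j * (j - 1)).
    """
    return sum(4.0 * ne / (j * (j - 1)) for j in range(2, k + 1))


# The two Ne values used in this notebook, with 3 samples per population.
for ne in (1e3, 2e5):
    t = expected_tmrca(ne, 3)
    print(f"Ne={ne:g}: expected TMRCA of 3 samples = {t:,.0f} generations")
```

Under these assumptions the expectation is 8Ne/3, so roughly 2,700 generations for Ne=1e3 and roughly 533,000 generations for Ne=2e5. Against a tree height of 1e6, samples from small-Ne tips should coalesce almost immediately, while samples from large-Ne tips will often fail to coalesce within their own tip branch.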

#### imbalanced tree

This looks correct.

In [3]:
# load a species tree
tree = toytree.rtree.imbtree(8, treeheight=1e6)

# set Ne high on tips 7, 5, 3, and 0; all other nodes get the default
tree = tree.set_node_values(
    attr="Ne",
    default=1e3,
    values={7: 2e5, 5: 2e5, 3: 2e5, 0: 2e5},
)

# load the ipcoal model and sample trees
model = ipcoal.Model(tree, samples=3, seed=333)
model.sim_loci(3);

# plot the species tree and a sampled genealogy side by side
canvas = toyplot.Canvas(width=700, height=400)
ax0 = canvas.cartesian(bounds=("10%", "45%", "10%", "90%"))
ax1 = canvas.cartesian(bounds=("55%", "90%", "10%", "90%"))
ax0.show = False; ax1.show = False

# draw species tree
tre = toytree.tree(model.tree)
tre.draw(
    axes=ax0, 
    node_labels=tre.get_node_values("idx", 1, 1),
    edge_widths="Ne",
);

# draw genealogy
tre = toytree.tree(model.df.genealogy[0])
tre.draw(axes=ax1);

#### balanced tree

In [11]:
# load a species tree
tree = toytree.rtree.baltree(8, treeheight=1e6)

# set Ne high on the odd-numbered tips; all other nodes get the default
tree = tree.set_node_values(
    attr="Ne",
    default=1e3,
    values={7: 2e5, 5: 2e5, 3: 2e5, 1: 2e5},
)

# load the ipcoal model and sample trees
model = ipcoal.Model(tree, samples=3, seed=333)
model.sim_loci(3);

# plot the species tree and a sampled genealogy side by side
canvas = toyplot.Canvas(width=700, height=400)
ax0 = canvas.cartesian(bounds=("10%", "45%", "10%", "90%"))
ax1 = canvas.cartesian(bounds=("55%", "90%", "10%", "90%"))
ax0.show = False; ax1.show = False

# draw species tree
tre = toytree.tree(model.tree)
tre.draw(
    axes=ax0, 
    node_labels=tre.get_node_values("idx", 1, 1),
    edge_widths="Ne",
);

# draw genealogy
tre = toytree.tree(model.df.genealogy[0])
tre.draw(axes=ax1);

### 2. Are the sequences ordered correctly?

It looks like they are not. I'm also starting to worry that there is an error either in rooting or in evolving data along an edge near the root, since this edge length is basically missing...
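Because rooting is itself in doubt here, it helps to compare the genealogy and the inferred tree in a way that ignores the root. The block below is a rough sketch, assuming simple newick strings with unquoted names; `clades` and `splits` are hypothetical helpers written for this illustration and are not toytree functions. Two trees with the same unrooted topology and the same tip labels give identical split sets, so a difference points to a topology or labeling problem rather than a rooting one.

```python
import re


def clades(newick):
    """Return a set of tip-name frozensets, one per internal node."""
    # Drop branch lengths, then drop any labels attached to internal nodes.
    s = re.sub(r":[^,()\s;]+", "", newick.strip().rstrip(";"))
    s = re.sub(r"\)[^,();]*", ")", s)
    stack, found, token = [], set(), ""
    for ch in s:
        if ch == "(":
            stack.append(set())
        elif ch in ",)":
            if token:
                stack[-1].add(token)
                token = ""
            if ch == ")":
                done = stack.pop()
                found.add(frozenset(done))
                if stack:
                    stack[-1] |= done
        else:
            token += ch
    return found


def splits(newick):
    """Return the non-trivial bipartitions, each stored as the side that
    does not contain a fixed reference tip, so that rooting is ignored."""
    cl = clades(newick)
    tips = frozenset().union(*cl)
    ref = min(tips)
    out = set()
    for c in cl:
        side = tips - c if ref in c else c
        if 2 <= len(side) <= len(tips) - 2:
            out.add(frozenset(side))
    return out


# Toy trees with made-up names: same topology rooted differently, and one
# with two tip labels swapped.
true_nwk = "((a:1,b:1):2,(c:1,d:1):2);"
rerooted_nwk = "(a:1,b:1,(c:1,d:1):4);"
swapped_nwk = "((a:1,c:1):2,(b:1,d:1):2);"

print(splits(true_nwk) == splits(rerooted_nwk))  # rooting does not matter
print(splits(true_nwk) == splits(swapped_nwk))   # swapped labels do
```

The size of the symmetric difference between two split sets is the Robinson-Foulds distance, which gives a single number to track while debugging.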

#### imbalanced tree

The edge lengths look very good, but the tip names are wrong.

In [15]:
# load a species tree
tree = toytree.rtree.imbtree(8, treeheight=1e6)

# set Ne values on tips
tree = tree.set_node_values(
    attr="Ne",
    default=1e3,
    values={7: 2e5, 5: 2e5, 3: 2e5, 0: 2e5},
)

# load the ipcoal model and sample trees
model = ipcoal.Model(tree, samples=3, seed=123, recomb=0)
model.sim_loci(1, 100000)
model.infer_gene_trees()

# plot the true genealogy and the inferred gene tree side by side
canvas = toyplot.Canvas(width=700, height=400)
ax0 = canvas.cartesian(bounds=("10%", "45%", "10%", "90%"))
ax1 = canvas.cartesian(bounds=("55%", "90%", "10%", "90%"))
ax0.show = False; ax1.show = False

# draw genealogy
toytree.tree(model.df.genealogy[0]).draw(axes=ax0);

# draw inferred tree
tre = toytree.tree(model.df.inferred_tree[0])
rtre = tre.root(wildcard="r7")
rtre.draw(axes=ax1);

#### balanced tree

The rooting edge length problem seen here needs to be fixed in toytree...

In [16]:
# load a species tree
tree = toytree.rtree.baltree(8, treeheight=1e6)

# set Ne values on tips
tree = tree.set_node_values(
    attr="Ne",
    default=1e3,
    values={7: 2e5, 5: 2e5, 3: 2e5, 1: 2e5},
)

# load the ipcoal model and sample trees
model = ipcoal.Model(tree, samples=3, seed=333, recomb=0)
model.sim_loci(1, 100000)
model.infer_gene_trees()

# plot the true genealogy and the inferred gene tree side by side
canvas = toyplot.Canvas(width=700, height=400)
ax0 = canvas.cartesian(bounds=("10%", "45%", "10%", "90%"))
ax1 = canvas.cartesian(bounds=("55%", "90%", "10%", "90%"))
ax0.show = False; ax1.show = False

# draw genealogy
toytree.tree(model.df.genealogy[0]).draw(axes=ax0);

# draw inferred tree
tre = toytree.tree(model.df.inferred_tree[0])
rtre = tre.root(regex="r[4-7].")
rtre.draw(axes=ax1);