# Cookbook

## Visualizing incomplete lineage sorting

The fundamental parameter of the coalescent is the effective population size (Ne), which determines the expected waiting time until N samples in a population coalesce to a common ancestor. **ipcoal** simulates a *structured coalescent*: a population tree or network imposes a hierarchical model in which individuals from distinct populations cannot coalesce with each other until their populations merge backward in time (unless there is migration).
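
To make the link between Ne and waiting time concrete, here is a minimal sketch of the standard expectations under the Kingman coalescent. It assumes a single diploid, panmictic population of constant size, so the pairwise coalescence rate is 1/(2Ne) per generation. The function names `expected_wait` and `expected_tmrca` are illustrative only and are not part of ipcoal.

```python
def expected_wait(k, Ne):
    """Expected generations until k lineages coalesce down to k - 1.

    Assumes a diploid population: each of the k*(k-1)/2 pairs
    coalesces at rate 1 / (2 * Ne) per generation.
    """
    return 4 * Ne / (k * (k - 1))


def expected_tmrca(n, Ne):
    """Expected generations until n samples reach one common ancestor."""
    return sum(expected_wait(k, Ne) for k in range(2, n + 1))


# compare the two Ne values used later in this section
for Ne in (2, 1e5):
    print(f"Ne={Ne:g}: expected TMRCA of 8 samples = {expected_tmrca(8, Ne):g} generations")
```

The sum telescopes to 4Ne(1 - 1/n), so the expected time to the common ancestor scales linearly with Ne and approaches 4Ne generations as the sample grows.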

In **ipcoal** you can easily set up simulations on empirical trees or networks with variable Ne values across different edges. This section demonstrates that setup, along with some visualization tools for verifying expected results.

### Getting Started

Imports

In [2]:
import ipcoal
import toytree
import numpy as np

This tree drawing style will be used in the following plots.

In [3]:
# drawing style
tstyle = {
    'layout': 'd',
    'edge_type': 'c',
    'tip_labels': True,
    'node_labels': False,
    'node_sizes': 8,
    'node_style': {
        "stroke": "#262626",
        "stroke-width": 1,
    },
    'scalebar': True,
}

### Simulate low ILS
We can start with the simplest simulation, in which `Ne` is set to a very low value so that samples within populations coalesce very rapidly, before any population divergence events are reached. We do not expect to observe any ILS in this case: genealogies should match the species tree.
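
As a rough check on this expectation, the sketch below uses the standard result for a rooted three-taxon species tree: the gene tree disagrees with the species tree with probability (2/3)·exp(-T), where T is the internal branch length in coalescent units, t / (2Ne) for diploids. The branch length of 2e4 generations is a hypothetical value chosen for illustration, not one taken from the simulated tree, and `prob_discordant` is not an ipcoal function.

```python
import math


def prob_discordant(t_gens, Ne):
    """Probability that a gene tree topology differs from a rooted
    three-taxon species tree with internal branch length t_gens.

    Assumes a diploid population, so T = t_gens / (2 * Ne).
    """
    T = t_gens / (2 * Ne)
    return (2 / 3) * math.exp(-T)


# hypothetical internal branch of 20,000 generations
t_gens = 2e4
for Ne in (2, 1e3, 1e4, 1e5):
    print(f"Ne={Ne:g}: P(discordant) = {prob_discordant(t_gens, Ne):.4f}")
```

With a very small Ne the branch spans thousands of coalescent units and the discordance probability is effectively zero; it only becomes appreciable once 2Ne is comparable to the branch length.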

In [4]:
# generate a random species tree
tre = toytree.rtree.unittree(8, treeheight=1e5, seed=123)

# simulate 10 unlinked SNPs on this tree
model1 = ipcoal.Model(tree=tre, Ne=2, seed=123)
model1.sim_snps(10)

In [5]:
# load gene trees to a multitree
mtre = toytree.mtree(model1.df.genealogy)

# draw the species tree
c, a = tre.draw(**tstyle, width=225, height=200);
c.text(c.width / 2, 20, "<b>Species tree</b>");

# draw the first 4 genealogies
c, a = mtre.draw_tree_grid(
    shared_axis=True,
    fixed_order=True,
    width=600, height=200, ncols=4, nrows=1,
    **tstyle)
c.text(c.width / 2, 20, "<b>Genealogies</b>");

### Simulate with high ILS

By setting the Ne value higher we expect to observe longer waiting times to coalescence, and likely coalescence events that predate population divergences such that ILS may occur. 

In [6]:
# generate species tree
tre = toytree.rtree.unittree(8, treeheight=1e5, seed=123)

# simulate 10 unlinked SNPs on model with high Ne
model2 = ipcoal.Model(tree=tre, Ne=1e5, seed=123)
model2.sim_snps(10)

In [7]:
# load gene trees to a multitree
mtre = toytree.mtree(model2.df.genealogy)

# draw the species tree
c, a = tre.draw(**tstyle, width=225, height=200);
c.text(c.width / 2, 20, "<b>Species tree</b>");

# draw the first 4 genealogies
c, a = mtre.draw_tree_grid(
    shared_axis=True,
    fixed_order=True,
    width=600, height=250, ncols=4, nrows=1,
    **tstyle)
c.text(c.width / 2, 20, "<b>Genealogies</b>");

### Simulate with variable Ne 

You can use **toytree** to vary Ne among nodes by assigning each node a feature named "Ne". This is demonstrated below using the `.set_node_values()` function call.

In [8]:
# init a balanced tree
tre = toytree.rtree.baltree(8, treeheight=1e6)

# set specific values to some nodes and a global default
tre = tre.set_node_values(
    attr="Ne", 
    values={8: 5e6, 9: 5e6, 12: 5e6},
    default=20,
)

In [9]:
# draw to show edge widths and node idxs
tre.draw(
    width=300, height=250,
    edge_widths=np.log(tre.get_edge_values("Ne")),
    tree_style='c',
    tip_labels=True,
);

As you can see in the tree above, the population sizes are very small on one half of the tree and very large on the other. This should lead to a characteristic genealogy shape with deep coalescences only for lineages 0, 1, 2, and 3.
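
The arithmetic behind this expectation can be sketched in plain Python. The two Ne values and the tree height are taken from the cells above; the diploid assumption (expected pairwise coalescence time of 2Ne generations) and the helper name `summarize` are assumptions made for illustration.

```python
import math

TREE_HEIGHT = 1e6                      # generations, as passed to baltree above
NE_VALUES = {"low": 20, "high": 5e6}   # the default and the node-specific Ne


def summarize(Ne, tree_height=TREE_HEIGHT):
    """Return the log-scaled edge width used for drawing and the expected
    pairwise coalescence time (diploid: 2 * Ne generations), both in
    generations and as a fraction of the total tree height."""
    pair_time = 2 * Ne
    return {
        "edge_width": math.log(Ne),
        "pair_coal_time": pair_time,
        "fraction_of_tree": pair_time / tree_height,
    }


for label, Ne in NE_VALUES.items():
    print(label, summarize(Ne))
```

Under these assumptions a pair of lineages in a low-Ne population is expected to coalesce within a tiny fraction of the tree height, while in a high-Ne population the expected time exceeds the height of the whole tree. The log transform keeps the drawn edge widths within a usable range despite the large difference in Ne.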

In [10]:
# init the model; no Ne argument is given, so Ne is read from the "Ne" feature on the tree nodes
model3 = ipcoal.Model(tree=tre, seed=12345)
model3.sim_snps(10)

In [11]:
# load gene trees to a multitree
mtre = toytree.mtree(model3.df.genealogy)

# draw the species tree
c, a = tre.draw(
    width=300, height=250,
    edge_widths=np.log(tre.get_edge_values("Ne")),
    **tstyle, 
);
c.text(c.width / 2, 20, "<b>Species tree</b>");

# draw the first 4 genealogies
c, a = mtre.draw_tree_grid(
    shared_axis=True,
    fixed_order=tre.get_tip_labels(),
    width=600, height=250, ncols=4, nrows=1,
    **tstyle)
c.text(c.width / 2, 20, "<b>Genealogies</b>");

## Interdependence of D-statistics