# Simulation
Coalescent simulations in *ipcoal* are performed using function calls to *msprime*, which stores results in the TreeSequence format implemented in *tskit*. We strive to keep *ipcoal* up to date with new versions of *msprime* and *tskit* and to implement requested features such as new substitution models, rate maps, or demographic modelling functions. However, we do not aim to implement every feature of *msprime*, as that would be redundant. Users of *ipcoal* can also access the `TreeSequence` objects produced by simulations (see Interaction with tskit), but by default these objects are discarded and only a summarized tabular result is stored. In this way, *ipcoal* should be viewed as a complementary tool to *msprime* and *tskit*, not as a replacement. It relies heavily on these tools for simulations; however, *ipcoal* has an entirely separate code base for its data analysis tools (see Phylogenetic Inference and Likelihood).

In [17]:
import ipcoal
import toytree

In [45]:
tre = toytree.rtree.unittree(5)
mtre = toytree.mtree([tre.mod.edges_slider(root=True) for i in range(100)])
c, a, m = mtre.draw_cloud_tree(scale_bar=True);
mtre[0].annotate.add_axes_scale_bar(a);

In [37]:
maxtre = max(mtre, key=lambda x: x[-1].height)

In [38]:
maxtre.draw();

In [2]:
model = ipcoal.Model(Ne=1e5, nsamples=10, mut=1e-8, recomb=1e-9)

### Simulation functions
An `ipcoal.Model` object has three methods for coalescent simulation: `sim_trees()`, `sim_loci()`, and `sim_snps()` (see Simulation Functions). Each serves a different purpose and accepts a number of arguments to modify its behavior. Under the hood, they represent different algorithms that make function calls to *msprime*. If you intend to set up highly complex simulations, it may be advantageous to perform them in *msprime* directly rather than through *ipcoal*. The main advantages of *ipcoal* come from the use of these functions and from its more limited scope and minimalist ethos, which make it easier to simulate and analyze data focused on phylogenetic trees (e.g., newick trees or sequence alignments).

The *msprime* functions called are `sim_ancestry` and `sim_mutations`.

#### sim_trees
The `sim_trees` function is the simplest and fastest simulation function. It generates only coalescent trees and does not simulate mutations. It takes two main arguments, `nloci` and `nsites`. In *ipcoal* we always treat loci as independent of one another; you can think of them as separate chromosomes. The length of each locus is given as a number of sites. To simulate completely unlinked genealogies, request the genealogy of a single site (`nsites=1`) from multiple independent loci. If you set `nsites` > 1 and the `Model`'s recombination rate is > 0, then recombination events can occur within a locus, giving rise to multiple linked genealogies (known as a tree sequence or ARG).
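To make the idea of unlinked genealogies concrete, the following is a minimal conceptual sketch of the standard (Kingman) coalescent for a single non-recombining site, written with the Python standard library only. It is not how *ipcoal* or *msprime* implement simulation; the function name `simulate_genealogy` and its arguments are hypothetical, and a constant diploid population size is assumed. Each call stands in for one independent locus with `nsites=1`.

```python
import random


def simulate_genealogy(nsamples, Ne, rng):
    """Simulate one genealogy for a single non-recombining site.

    Hypothetical helper for illustration only (not part of ipcoal).
    Returns a newick-like topology string (no branch lengths) and the
    list of coalescence times in generations.
    """
    lineages = [str(i) for i in range(nsamples)]
    times = []
    t = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        # k lineages form k*(k-1)/2 pairs; each pair coalesces at
        # rate 1 / (2 * Ne) per generation in a diploid population
        rate = k * (k - 1) / 2 / (2 * Ne)
        t += rng.expovariate(rate)
        times.append(t)
        # choose two distinct lineages uniformly at random and merge them
        a, b = rng.sample(range(k), 2)
        merged = f"({lineages[a]},{lineages[b]})"
        lineages = [x for i, x in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
    return lineages[0] + ";", times


# five independent single-site loci yield five unlinked genealogies
rng = random.Random(123)
genealogies = [simulate_genealogy(10, 1e5, rng) for _ in range(5)]
for topology, times in genealogies:
    print(f"TMRCA={times[-1]:.0f} generations  {topology}")
```

Because each locus is drawn independently, the topologies and root heights vary from locus to locus. Linked genealogies within a recombining locus (`nsites` > 1) are not covered by this sketch.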

### Parallelization
You can parallelize `sim_trees` by setting the argument `nproc` to a value > 1, in which case each locus (TreeSequence) is simulated independently, with loci distributed across `nproc` processors. Parallel results remain reproducible with `seed_trees` among runs that use the same `nproc` setting, but not currently between runs that use different `nproc` settings (TODO).

In [11]:
%%timeit
model.sim_trees(1e4, nproc=4)

11.3 s ± 105 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [12]:
%%timeit
model.sim_trees(1e4, nproc=1)

19.6 s ± 312 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
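As a generic illustration of seeded reproducibility in parallel runs, the sketch below draws one seed per locus from a single master seed before any work is dispatched, so the result does not depend on the number of workers. This is an assumption-laden sketch and not a description of how *ipcoal* handles `seed_trees` or `nproc`: the names `locus_seeds`, `simulate_locus`, and `simulate_all` are hypothetical, the per-locus "simulation" is a stand-in, and a thread pool is used in place of a process pool for brevity.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def locus_seeds(master_seed, nloci):
    """Derive one seed per locus from a single master seed."""
    rng = random.Random(master_seed)
    return [rng.randrange(1, 2**31) for _ in range(nloci)]


def simulate_locus(seed):
    """Stand-in for one locus simulation; depends only on its own seed."""
    return random.Random(seed).expovariate(1.0)


def simulate_all(master_seed, nloci, nproc):
    """Simulate all loci using nproc workers (hypothetical helper)."""
    seeds = locus_seeds(master_seed, nloci)
    with ThreadPoolExecutor(max_workers=nproc) as pool:
        # map() returns results in input order, whichever worker ran them
        return list(pool.map(simulate_locus, seeds))


# the same master seed gives the same results for any worker count
serial = simulate_all(123, nloci=20, nproc=1)
parallel = simulate_all(123, nloci=20, nproc=4)
print(serial == parallel)
```

The design choice here is that randomness is tied to the locus rather than to the worker, which is what makes the output independent of `nproc`.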


### store_tree_sequences

In [55]:
# use arg to store/not-store TreeSequences
model = ipcoal.Model(Ne=1e5, nsamples=10, store_tree_sequences=False)

# by default TreeSequences are not stored, to save memory
model.sim_trees(nloci=2, nsites=1e5)
model.ts_dict

{}

In [56]:
# you can also change this setting after initialization
model.store_tree_sequences = True

# each locus is an independent TreeSequence simulation
model.sim_trees(nloci=2, nsites=1e5)
model.ts_dict

{0: <tskit.trees.TreeSequence at 0x7ff988d137c0>,
 1: <tskit.trees.TreeSequence at 0x7ff988a4dc90>}

### get_tree_sequence()

In [57]:
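# Conceptual sketch of get_tree_sequence (pseudocode, not runnable as written):
# def get_tree_sequence(nsites):
#     treeseq = msprime.sim_ancestry(...)  # arguments come from Model attributes
#     mutated_ts = msprime.sim_mutations(treeseq, ...)  # arguments come from Model attributes
#     return mutated_ts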

In [62]:
ts = model.get_tree_sequence(10)
ts.trees()

<tskit.trees.TreeIterator at 0x7ff988a4e5c0>