# Simulate continuous traits

Simulate one or more continuous traits under one or more models of trait evolution. 

In [1]:
import toytree

In [2]:
# tree used in examples
tree = toytree.rtree.unittree(ntips=6, treeheight=1.0, seed=123)

## Brownian motion

The amount of change in a continuous trait over a given time interval can be modeled under Brownian motion as the result of a random walk. At each time step the value changes by an amount randomly sampled from a normal distribution with mean 0 and a variance set by an evolutionary rate parameter, $\sigma^2$. To model the total change over an interval of time $t$ we can simply sample a single random value from a normal distribution with mean 0 and variance equal to the product of the rate parameter and the length of time ($\sigma^2 t$).
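
As a rough illustration of this equivalence, the sketch below uses only the Python standard library (not toytree) to compare a random walk made of many small steps against a single draw with variance $\sigma^2 t$. The function names `bm_walk` and `bm_single_draw` are made up for this example and are not part of any library.

```python
import random
import statistics

def bm_walk(sigma2, t, nsteps, rng):
    """Sum many small normal steps, each with variance sigma2 * (t / nsteps)."""
    dt = t / nsteps
    sd = (sigma2 * dt) ** 0.5
    return sum(rng.gauss(0.0, sd) for _ in range(nsteps))

def bm_single_draw(sigma2, t, rng):
    """Sample the total change over the whole interval in a single draw."""
    return rng.gauss(0.0, (sigma2 * t) ** 0.5)

rng = random.Random(123)
sigma2, t = 2.0, 0.5
walks = [bm_walk(sigma2, t, 50, rng) for _ in range(5000)]
draws = [bm_single_draw(sigma2, t, rng) for _ in range(5000)]

# both sample variances should be close to sigma2 * t = 1.0
var_walks = statistics.pvariance(walks)
var_draws = statistics.pvariance(draws)
```

Because the two approaches give the same distribution of total change, the single draw is the cheaper choice when only the end points of an interval matter.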

### simulate_continuous_bm
Simulated traits are labeled t0, t1, t2, and so on, unless the `rates` argument is entered as a mapping (e.g., a dict), in which case traits can be given custom names. By default, simulated data are returned in a pandas DataFrame. Alternatively, you can use the argument `inplace=True` to store the simulated data to Node objects of the input tree, where they can be fetched by calling `tree.get_node_data()`.

In [3]:
# call from the module-level API
trait = toytree.pcm.simulate_continuous_bm(tree, rates=1.0)

# call from the tree-level API (equivalent to above)
trait = tree.pcm.simulate_continuous_bm(rates=1.0);

#### rates
The `rates` argument takes one or more $\sigma^2$ evolutionary rate parameters. Note that the variance in a trait over the length of a branch is the product of the rate parameter and the branch length, so you should take the branch length units of your tree into account when selecting rate parameters.
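
To make the dependence on branch length units concrete, here is a small arithmetic sketch in plain Python. The helper `expected_variance` is a hypothetical name used only for this illustration, and the rescaling factor of 100 is an arbitrary example.

```python
def expected_variance(sigma2, branch_length):
    """Variance of the trait change accumulated along one branch under BM."""
    return sigma2 * branch_length

# a branch of length 0.5 on a tree scaled to a height of 1.0
var_unit = expected_variance(sigma2=1.0, branch_length=0.5)

# the same branch after rescaling the tree to a height of 100
var_scaled_same_rate = expected_variance(sigma2=1.0, branch_length=50.0)

# dividing the rate by the same factor of 100 recovers the original variance
var_scaled_adjusted = expected_variance(sigma2=1.0 / 100, branch_length=50.0)
```

With an unchanged rate, the rescaled tree yields 100 times more variance along the same branch, which is why a rate chosen for one set of units rarely transfers to another.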

In [4]:
# simulate one trait on the tree
toytree.pcm.simulate_continuous_bm(tree, rates=1.0)

idx,t0
0,-0.483718
1,0.092763
2,0.577297
3,-0.535492
4,-0.413495
5,-0.920901
6,-0.193297
7,-0.066605
8,-0.069617
9,-0.301492


In [5]:
# simulate three traits with different sigma2 params
toytree.pcm.simulate_continuous_bm(tree, rates=[1.0, 2.0, 3.0])

idx,t0,t1,t2
0,-0.255294,-1.779344,-1.148281
1,-0.22626,-0.687899,-2.897875
2,0.307016,1.29452,-1.716461
3,0.294204,-0.106678,-2.083741
4,1.104496,-3.271169,1.061524
5,-0.271915,3.139083,-0.012942
6,-0.268827,-0.469807,-1.955259
7,0.184441,0.934781,-0.764027
8,-0.473523,-0.104635,-1.052509
9,0.206326,0.364401,0.465843


In [6]:
# use a dict to assign custom names to traits
toytree.pcm.simulate_continuous_bm(tree, rates={"size": 1.0, "speed": 5.0})

idx,size,speed
0,-0.913704,1.379901
1,-0.88611,-0.754444
2,-1.370901,-1.273465
3,-0.963715,0.728306
4,-0.076888,0.016406
5,0.576823,-4.247222
6,-0.656268,0.038858
7,-1.143345,-0.448983
8,-0.602107,0.749318
9,0.03922,-1.395102


#### tips_only
The data simulated above include a trait value for every node in the tree, including internal nodes. However, in many cases we may be interested only in the traits at the tips of the tree. The argument `tips_only` will return only the simulated values for the tip nodes. (Note that the simulation process requires generating values for internal nodes, so you are effectively discarding that information when using this option, but it can be useful to keep things tidy.)

In [7]:
# simulate traits and store only for the tips
toytree.pcm.simulate_continuous_bm(tree, rates=[1.0], tips_only=True)

idx,t0
0,-0.162192
1,0.043205
2,0.08652
3,0.038125
4,0.094811
5,-0.609802
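
The following standard-library sketch shows why internal node values have to be generated even when only tip values are kept. It is not toytree's implementation: the tiny three-tip tree, its node names, and the function `simulate_bm_on_tree` are all hypothetical and chosen only for illustration.

```python
import random

# hypothetical 3-tip tree: child -> (parent, branch length); "root" has no parent
parents = {
    "anc": ("root", 0.4),
    "a": ("anc", 0.6),
    "b": ("anc", 0.6),
    "c": ("root", 1.0),
}
tips = ["a", "b", "c"]

def simulate_bm_on_tree(parents, sigma2, root_state, rng):
    """Return a trait value for every node, visiting parents before children."""
    values = {"root": root_state}
    # the dict above lists each parent before its children
    for node, (parent, length) in parents.items():
        step = rng.gauss(0.0, (sigma2 * length) ** 0.5)
        values[node] = values[parent] + step
    return values

rng = random.Random(123)
all_values = simulate_bm_on_tree(parents, sigma2=1.0, root_state=0.0, rng=rng)

# keeping only the tips discards the internal values needed along the way
tip_values = {name: all_values[name] for name in tips}
```

Tips `a` and `b` both start from the value simulated at `anc`, so their shared history is what makes their trait values correlated.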


#### root_states
You can set the root state for one or more simulated traits using the option `root_states`. The default root state is 0, so in the simulations above the root node (node 10) has a value of 0.0 for each trait. Below we simulate traits on the same tree but with different starting (root) values.

In [8]:
# simulate three traits with different sigma2 params and different root states
toytree.pcm.simulate_continuous_bm(tree, rates=[1.0, 2.0, 3.0], root_states=[10, 12, 50])

idx,t0,t1,t2
0,8.574025,11.152688,50.883332
1,9.075534,12.082495,49.803262
2,8.569308,12.342059,49.965491
3,8.444484,11.8115,49.98399
4,9.648971,11.544169,54.604667
5,9.99471,8.69583,48.073702
6,8.97049,11.645473,50.246767
7,8.69846,11.940916,50.645462
8,9.127228,11.998511,50.72211
9,10.253479,10.994466,50.205961
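
Brownian motion has no directional trend, so changing the root state shifts every simulated value by the same amount without changing its spread. The sketch below checks this with the standard library only. The helper `simulate_tip` is a hypothetical name, and it treats the whole root-to-tip path as a single interval, which is valid because independent normal steps sum to a normal with the summed variance.

```python
import random
import statistics

def simulate_tip(root_state, sigma2, height, rng):
    """Trait value at a tip whose path from the root has total length `height`."""
    return root_state + rng.gauss(0.0, (sigma2 * height) ** 0.5)

rng = random.Random(123)
tips_from_0 = [simulate_tip(0.0, 1.0, 1.0, rng) for _ in range(10000)]
tips_from_10 = [simulate_tip(10.0, 1.0, 1.0, rng) for _ in range(10000)]

# the average tip value stays near the root state in both cases
mean_0 = statistics.mean(tips_from_0)
mean_10 = statistics.mean(tips_from_10)
```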


#### inplace
By default, the simulated data are returned in a pandas DataFrame where the index corresponds to the numeric idx labels of Nodes in the tree. Alternatively, you can use `inplace=True` to store the simulated traits as one or more features saved to Nodes of the tree.

In [9]:
# save simulated traits to the ToyTree
toytree.pcm.simulate_continuous_bm(tree, rates=[1.0, 2.0], inplace=True)

# fetch simulated trait feature data from the tree
tree.get_node_data(["t0", "t1"])

idx,t0,t1
0,-0.247272,2.879741
1,0.411316,1.757141
2,0.510895,1.939581
3,0.395498,0.945574
4,-0.172384,-1.08429
5,1.343654,0.871966
6,0.131007,1.96902
7,0.345196,1.79786
8,0.310709,1.382211
9,0.306519,-0.688091


One motivation for this option is that it makes it very easy to visualize the traits on a tree drawing, because you can select a trait by its feature name rather than passing in the trait values yourself. Here we use color mapping to draw node colors scaled to the Greys colormap.

In [14]:
# draw the tree and show trait t0 values 
tree.draw(node_sizes=10, node_colors=("t0", "Greys"), node_mask=False, label="trait 't0'");

## Multivariate Brownian motion
To simulate traits with correlated evolution you can enter a variance-covariance matrix for the `rates` option. This can be a list of lists, a numpy array, or a pandas DataFrame.

In [11]:
# TODO
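
While the toytree example above is still pending, the idea of correlated change can be sketched with the standard library alone. This is not toytree's implementation: the helpers `cholesky_2x2` and `correlated_change` are hypothetical names, the covariance values are arbitrary, and the sketch assumes a valid (positive definite) 2x2 variance-covariance matrix.

```python
import random
import statistics

def cholesky_2x2(cov):
    """Lower-triangular factor L of a 2x2 covariance matrix, so L times L-transpose equals cov."""
    (a, b), (_, c) = cov
    l11 = a ** 0.5
    l21 = b / l11
    l22 = (c - l21 ** 2) ** 0.5
    return l11, l21, l22

def correlated_change(cov, t, rng):
    """Sample the change in two traits over time t, with covariance cov * t."""
    l11, l21, l22 = cholesky_2x2(cov)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    scale = t ** 0.5
    return scale * l11 * z1, scale * (l21 * z1 + l22 * z2)

rng = random.Random(123)
cov = [[1.0, 0.8], [0.8, 2.0]]
changes = [correlated_change(cov, 1.0, rng) for _ in range(20000)]
xs, ys = zip(*changes)

# the sample covariance between the two traits should be close to 0.8
mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
sample_cov = sum((x - mean_x) * (y - mean_y) for x, y in changes) / len(changes)
```

The diagonal entries of the matrix play the role of the per-trait $\sigma^2$ rates, and the off-diagonal entry controls how strongly changes in the two traits move together.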