# Three-area DEC RevBayes

https://revbayes.github.io/tutorials/biogeo/biogeo_simple.html

![schematic](fig_range_Evol_events.png "schematic")

### Set working directory and input and output filenames

In [1]:
getwd()
setwd("/Users/cdoorenweerd/Documents/StackIP/Projects active/Saturnia/Saturnia_Biogeography/RevBayes_DEC2")

range_fn = "saturnia.n3.range.nex"
tree_fn = "20180606_Saturnia_BEAST_constrained2_ingroup_nocopaxa.nwk"
#times_fn = "saturnia.n4.times.txt"
dist_fn = "saturnia.n3.distances.txt"
connectivity_fn = "saturnia.n3.connectivity.1.txt"

out_fn = "three_area_DEC"
out_state_fn = out_fn + ".states.log"
out_tree_fn = out_fn + ".tre"
out_mcc_fn = out_fn + ".mcc.tre"

mvi = 1
mni = 1

   /usr/local/lib/python3.6/site-packages/revbayes_kernel-0.1.0-py3.6.egg/revbayes_kernel
Received directory:		/Users/cdoorenweerd/Documents/StackIP/Projects active/Saturnia/Saturnia_Biogeography/RevBayes_DEC2


### Read in geographical ranges

In [2]:
dat_range_01 = readDiscreteCharacterData(range_fn) # read character data as binary presence-absence
n_areas = dat_range_01.nchar() # record total number of areas
dat_range_01

max_areas <- 2
n_states <- 0
for (k in 0:max_areas) n_states += choose(n_areas, k)

dat_range_n = formatDiscreteCharacterData(dat_range_01, "DEC", n_states) # encode ranges into natural numbers for one character
n_areas
n_states

   Successfully read one character matrix from file 'saturnia.n4.range.nex'

   Standard character matrix with 33 taxa and 3 characters
   Origination:                   saturnia.n4.range.nex
   Number of taxa:                33
   Number of included taxa:       33
   Number of characters:          3
   Number of included characters: 3
   Datatype:                      Standard

   3
   7


Write legend for state labels and range states to file

In [3]:
state_desc = dat_range_n.getStateDescriptions()

state_desc_str = "state,range\n"
for (i in 1:state_desc.size()){state_desc_str += (i-1) + "," + state_desc[i] + "\n"}
write(state_desc_str, file=out_fn+".state_labels.txt")

### Read in dated molecular phylogenetic tree

In [4]:
tree <- readTrees(tree_fn)[1]
tree

   Attempting to read the contents of file "20180606_Saturnia_BEAST_constrained2_ingroup_nocopaxa.nwk"
   Successfully read file
  
   (((((((((((SARBA615_09_Rinaca_chinensis[&index=33]:3.513580,SARBC471_10_Rinaca_ruda_HQ973326[&index=32]:3.513580)[&index=34]:1.239046,SAUPA465_10_Rinaca_naumanni_JN278744[&index=31]:4.752626)[&index=35]:4.110972,(SARBA578_09_Rinaca_minshanensis[&index=30]:4.057900,SARBC468_10_Rinaca_florianii_HQ973323[&index=29]:4.057900)[&index=36]:4.805698)[&index=37]:0.625938,(((SARBA597_09_Rinaca_fujiana[&index=28]:1.349058,SARBA613_09_Rinaca_shaanxiana[&index=27]:1.349058)[&index=38]:3.796523,SARBE1506_13_Rinaca_dusii[&index=26]:5.145581)[&index=39]:1.342367,act16_Rinaca_jonasii[&index=25]:6.487948)[&index=40]:3.001588)[&index=41]:1.296271,act18_Rinaca_boisduvalii[&index=24]:10.785807)[&index=42]:0.391241,SARBB798_09_Rinaca_thibeta_GU664031[&index=23]:11.177048)[&index=43]:2.703537,(SASNB457_09_Rinaca_lesoudieri_GU702972[&index=22]:6.207512,SAWNA102_09_Rinaca_zulei

### Build DEC rate matrices and transition probability matrix

In [5]:
# read connectivity matrix
connectivity <- readDataDelimitedFile(file=connectivity_fn, delimiter=" ")

# read distances matrix
distances <- readDataDelimitedFile(file=dist_fn, delimiter=" ")

# parameter for arrival rate of anagenetic range evolution events
log10_rate_bg ~ dnUniform(-4,2)
log10_rate_bg.setValue(-2)
rate_bg := 10^log10_rate_bg
moves[mvi++] = mvSlide(log10_rate_bg, weight=4)

# dispersal rate matrix
dispersal_rate <- 1.0 # fix base rate of dispersal
distance_scale ~ dnUnif(0,20)
distance_scale.setValue(0.01)
moves[mvi++] = mvScale(distance_scale, weight=3)

for (j in 1:n_areas) {
  for (k in 1:n_areas) {
    dr[j][k] <- 0.0
    if (connectivity[j][k] > 0) {
      dr[j][k]  := dispersal_rate * exp(-distance_scale * distances[j][k])
    }
  }
}

# prior distribution for the relative extirpation rate and assign a move
log_sd <- 0.5
log_mean <- ln(1) - 0.5*log_sd^2
extirpation_rate ~ dnLognormal(mean=log_mean, sd=log_sd)
moves[2] = mvScale(extirpation_rate, weight=2)

for (j in 1:n_areas) {
  for (k in 1:n_areas) {
    er[j][k] <- 0.0
  }
  er[j][j] := extirpation_rate
}

# unify objects
Q_DEC := fnDECRateMatrix(dispersalRates=dr, extirpationRates=er, maxRangeSize=max_areas) # relative matrix

# cladogenetic transition probability matrix
clado_event_types <- [ "s", "a" ]
clado_event_probs <- simplex(1, 1)
P_DEC := fnDECCladoProbs(eventProbs=clado_event_probs,
                            eventTypes=clado_event_types,
                            numCharacters=n_areas,
                            maxRangeSize=max_areas)

# encapsulate all model components
m_bg ~ dnPhyloCTMCClado(tree=tree,
                           Q=Q_DEC,
                           cladoProbs=P_DEC,
                           branchRates=rate_bg,
                           nSites=1,
                           type="NaturalNumbers")

# attach observed ranges to the model
m_bg.clamp(dat_range_n)

### Run model

In [6]:
# add monitors that keep track of the MCMC
monitors[1] = mnScreen(rate_bg, extirpation_rate, printgen=100)
monitors[2] = mnModel(file=out_fn+".params.log", printgen=10)
monitors[3] = mnFile(tree, file=out_fn+".tre", printgen=10)
monitors[4] = mnJointConditionalAncestralState(tree=tree,
                                                    ctmc=m_bg,
                                                    filename=out_fn+".states.log",
                                                    type="NaturalNumbers",
                                                    printgen=10,
                                                    withTips=true,
                                                    withStartStates=true)

In [7]:
mymodel = model(m_bg)

In [8]:
mymcmc = mcmc(mymodel, moves, monitors)

In [9]:
mymcmc.run(25000)


   Running MCMC simulation
   This simulation runs 1 independent replicate.
   The simulator uses 2 different moves in a random move schedule with 6 moves per iteration

Iter        |      Posterior   |     Likelihood   |          Prior   |   extirpatio..   |        rate_bg   |    elapsed   |        ETA   |
------------------------------------------------------------------------------------------------------------------------------------------
0           |       -29.0726   |       -24.3084   |       -4.76418   |      0.6729202   |           0.01   |   00:00:00   |   --:--:--   |
100         |       -28.7534   |       -23.9854   |       -4.76797   |      0.7213904   |    0.008262751   |   00:00:02   |   --:--:--   |
200         |       -29.0195   |       -24.1738   |        -4.8457   |      0.8419831   |    0.002781873   |   00:00:04   |   00:08:16   |
300         |       -31.6813   |       -22.8836   |       -8.79767   |       2.844191   |    0.004063189   |   00:00:07   |   00:09:36

5200        |        -29.045   |       -24.2623   |       -4.78263   |      0.6229127   |    0.002851128   |   00:02:00   |   00:07:36   |
5300        |       -28.7443   |       -23.5899   |       -5.15443   |       1.069547   |    0.004705486   |   00:02:02   |   00:07:33   |
5400        |       -28.7341   |       -23.9303   |       -4.80379   |      0.5961216   |    0.005191164   |   00:02:05   |   00:07:33   |
5500        |       -29.4334   |       -24.1673   |       -5.26605   |       1.134713   |    0.002509163   |   00:02:07   |   00:07:30   |
5600        |       -29.1546   |       -23.7212   |       -5.43333   |       1.226069   |    0.003550096   |   00:02:09   |   00:07:26   |
5700        |       -28.6478   |        -23.864   |       -4.78376   |      0.7604691   |    0.004228265   |   00:02:11   |   00:07:23   |
5800        |       -29.4273   |       -23.2192   |       -6.20808   |       1.607917   |    0.005007605   |   00:02:14   |   00:07:23   |
5900        |       -30.026

10500       |       -30.2219   |       -25.1844   |       -5.03751   |      0.4745977   |    0.001436489   |   00:04:02   |   00:05:34   |
10600       |       -30.2135   |       -24.3643   |       -5.84923   |      0.3289444   |    0.007026517   |   00:04:05   |   00:05:32   |
10700       |       -29.2732   |       -23.3643   |       -5.90889   |       1.464972   |    0.004586153   |   00:04:07   |   00:05:30   |
10800       |       -28.6971   |       -23.8194   |       -4.87773   |       0.873028   |    0.008403906   |   00:04:09   |   00:05:27   |
10900       |       -29.4832   |       -24.7185   |       -4.76473   |      0.6690288   |    0.001905636   |   00:04:11   |   00:05:24   |
11000       |       -28.8538   |       -23.3777   |        -5.4761   |       1.248571   |    0.005786834   |   00:04:13   |   00:05:22   |
11100       |       -29.2694   |       -23.2847   |       -5.98461   |       1.501477   |    0.009567191   |   00:04:16   |   00:05:20   |
11200       |       -29.240



Iter        |      Posterior   |     Likelihood   |          Prior   |   extirpatio..   |        rate_bg   |    elapsed   |        ETA   |
------------------------------------------------------------------------------------------------------------------------------------------
16000       |       -28.6938   |       -23.9305   |       -4.76331   |      0.6849269   |    0.004140221   |   00:06:11   |   00:03:28   |
16100       |       -28.6042   |       -23.8409   |       -4.76329   |       0.688541   |    0.005654656   |   00:06:13   |   00:03:26   |
16200       |       -32.0275   |        -22.004   |       -10.0235   |       3.479065   |    0.007891943   |   00:06:15   |   00:03:23   |
16300       |       -29.1792   |       -23.6815   |       -5.49773   |       1.259846   |    0.003630649   |   00:06:18   |   00:03:21   |
16400       |       -29.2508   |       -23.1219   |        -6.1289   |       1.570387   |    0.007814579   |   00:06:20   |   00:03:19   |
16500       |       -28.8

21300       |       -28.8894   |       -24.1121   |        -4.7773   |      0.6321024   |    0.003346203   |   00:08:11   |   00:01:25   |
21400       |       -30.2796   |       -24.3575   |       -5.92209   |       1.471359   |     0.00201817   |   00:08:14   |   00:01:23   |
21500       |       -29.2912   |       -23.9541   |       -5.33706   |        1.17424   |    0.002937269   |   00:08:16   |   00:01:20   |
21600       |       -32.3798   |       -22.3155   |       -10.0643   |       3.500986   |     0.01490069   |   00:08:18   |   00:01:18   |
21700       |       -29.3664   |        -24.548   |       -4.81843   |      0.5821325   |     0.01051622   |   00:08:20   |   00:01:16   |
21800       |       -29.1863   |       -23.8517   |        -5.3346   |       1.172887   |     0.00321542   |   00:08:23   |   00:01:13   |
21900       |       -28.7184   |       -23.9399   |       -4.77846   |      0.6299502   |    0.004435846   |   00:08:26   |   00:01:11   |


Iter        |      Poster

### Write results to output files

In [10]:
tree_trace = readTreeTrace(file=out_tree_fn, treetype="clock")
tree_trace.setBurnin(0.25) # gratuitous when phylogeny is fixed
n_burn = tree_trace.getBurnin()

mcc_tree = mccTree(tree_trace, file=out_mcc_fn)
state_trace = readAncestralStateTrace(file=out_state_fn)
tree_trace = readAncestralStateTreeTrace(file=out_tree_fn, treetype="clock")

anc_tree = ancestralStateTree(tree=mcc_tree,
                              ancestral_state_trace_vector=state_trace,
                              tree_trace=tree_trace,
                              include_start_states=true,
                              file=out_tree_fn+".ase.tre",
                              burnin=n_burn,
                              site=1)

   Processing file "/Users/cdoorenweerd/Documents/StackIP/Projects active/Saturnia/Saturnia_Biogeography/RevBayes_DEC2/four_area_DEC.tre"

Progress:
0---------------25---------------50---------------75--------------100
********************************************************************

   Compiling maximum clade credibility tree from 2501 trees in tree trace, using a burnin of 625 trees.
   Summarizing clades ...

Progress:
0---------------25---------------50---------------75--------------100
********************************************************************

   Annotating tree ...
Processing file "/Users/cdoorenweerd/Documents/StackIP/Projects active/Saturnia/Saturnia_Biogeography/RevBayes_DEC2/four_area_DEC.states.log"
Processing file "/Users/cdoorenweerd/Documents/StackIP/Projects active/Saturnia/Saturnia_Biogeography/RevBayes_DEC2/four_area_DEC.tre"
   Compiling MAP ancestral states from 2501 samples in the ancestral state trace, using a burnin of 625 samples.
   Calculating an

### Calculate marginal likelihood

Using stepping-stone simulation and path sampling

In [11]:
pow_p = powerPosterior(mymodel, moves, monitors, "model.out", cats=50) 
pow_p.burnin(generations=10000,tuningInterval=1000)


Running burn-in phase of Power Posterior sampler for 10000 iterations.
The simulator uses 2 different moves in a random move schedule with 6 moves per iteration


Progress:
0---------------25---------------50---------------75--------------100
********************************************************************


In [12]:
pow_p.run(generations=1000)


Running power posterior analysis ...
Step  1 / 51		****************************************
Step  2 / 51		****************************************
Step  3 / 51		****************************************
Step  4 / 51		****************************************
Step  5 / 51		****************************************
Step  6 / 51		****************************************
Step  7 / 51		****************************************
Step  8 / 51		****************************************
Step  9 / 51		****************************************
Step 10 / 51		****************************************
Step 11 / 51		****************************************
Step 12 / 51		****************************************
Step 13 / 51		****************************************
Step 14 / 51		****************************************
Step 15 / 51		****************************************
Step 16 / 51		****************************************
Step 17 / 51		****************************************
Step 18 / 51		*************

In [13]:
ss = steppingStoneSampler(file="model.out", powerColumnName="power", likelihoodColumnName="likelihood")

In [14]:
ss.marginal()

   -25.44625


In [15]:
ps = pathSampler(file="model.out", powerColumnName="power", likelihoodColumnName="likelihood")

In [16]:
ps.marginal() 

   -25.45067
