README.md
+31-39
@@ -3,8 +3,8 @@ author: Denise Kühnert, Jūlija Pečerska
3
3
level: Professional
4
4
title: Structured birth-death model
5
5
subtitle: Population structure using the multi-type birth-death model
6
-
beastversion: 2.5.2
7
-
tracerversion: 1.7.0
6
+
beastversion: 2.7.4
7
+
tracerversion: 1.7.2
8
8
---
9
9
10
10
@@ -46,7 +46,7 @@ TreeAnnotator is provided as a part of the BEAST2 package so you do not need to
46
46
47
47
### Tracer
48
48
49
-
Tracer ([http://tree.bio.ed.ac.uk/software/tracer](http://tree.bio.ed.ac.uk/software/tracer)) is used to summarize the posterior estimates of the various parameters sampled by the Markov Chain. This program can be used for visual inspection and to assess convergence. It helps to quickly view median estimates and 95% highest posterior density intervals of the parameters, and calculates the effective sample sizes (ESS) of parameters. It can also be used to investigate potential parameter correlations. We will be using Tracer v{{ page.tracerversion }}.
49
+
Tracer ([https://github.com/beast-dev/tracer/releases/tag/v1.7.2](https://github.com/beast-dev/tracer/releases/tag/v1.7.2)) is used to summarize the posterior estimates of the various parameters sampled by the Markov Chain. This program can be used for visual inspection and to assess convergence. It helps to quickly view median estimates and 95% highest posterior density intervals of the parameters, and calculates the effective sample sizes (ESS) of parameters. It can also be used to investigate potential parameter correlations. We will be using Tracer v{{ page.tracerversion }}.
50
50
51
51
### IcyTree
52
52
@@ -62,7 +62,7 @@ IcyTree ([https://icytree.org](https://icytree.org)) is a browser-based phylogen
62
62
You can easily install the `bdmm` package via BEAUti's package manager. To do this, follow these steps:
63
63
64
64
1. Start BEAUti;
65
-
2. In the application menu, `File > Manage packages`.
65
+
2. In the application menu, `File > Manage Packages`.
66
66
3. Find `bdmm` in the list of packages shown, select it and then click `Install/Upgrade`.
67
67
68
68
The BEAUti window should look similar to what is shown in [Figure 1](#fig:install-bdmm).
@@ -82,7 +82,7 @@ If you get an error message stating that you are missing a package on which `bdm
82
82
83
83
# Setting up the analysis using BEAUti
84
84
85
-
## Loading the Template
85
+
## Loading the template
86
86
87
87
A BEAUti template defines the basic structure and contents of your XML configuration file.
88
88
By default, BEAUti will construct an XML file with standard uncoloured BEAST trees; however, `bdmm` uses coloured trees, which are defined in the `MultiTypeTree` package.
@@ -98,23 +98,19 @@ To use the appropriate template for the configuration file, select `File > Templ
98
98
99
99
## Loading the data
100
100
101
-
Once the template is loaded, we can load in our example sequence data. In our case, this data is stored in a FASTA file, the first few lines of which look like this (the sequences have been truncated for better readability):
101
+
Once the template is loaded, we can load in our example sequence data. In our case, these data are stored in a FASTA file, the first few lines of which look like this (the sequences have been truncated for better readability):
The lines beginning with ">" are labels for the sequences immediately
@@ -136,7 +132,7 @@ To set the working directory, select `File > Set working dir > MultiTypeTree`, a
136
132
</figure>
137
133
<br>
138
134
139
-
To load the file, select `File > Add alignment`.
135
+
To load the file, select `File > Add Alignment`.
140
136
141
137
This will open a file selection dialog box. The example influenza sequence data
142
138
file is named `h3n2_2deme.fna`.
@@ -189,7 +185,7 @@ Now that we've specified the sampling times, we move on to specifying the sampli
189
185
To do this, we follow a very similar set of steps to those we used to set the sample times:
190
186
191
187
1. Select the `Tip Locations` panel. You'll find that the locations are already filled with a single default value – `NOT_SET`.
192
-
2. Click the `Guess` button at the top-right of the panel. This opens the same dialog that we saw in the previous section when setting up the dates.
188
+
2. Click the `Guess` button at the top-left of the panel. This opens the same dialog that we saw in the previous section when setting up the dates.
193
189
3. The locations are included as the second element of the underscore-delimited sequence names.
194
190
Therefore we choose the `split on character` radio button and select group `2` from the drop-down menu.
195
191
Note again that the underscore character is already chosen as the delimiter.
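
The `split on character` option described above can be pictured as taking the second underscore-delimited field of each sequence name. The following is a minimal sketch of that idea, not BEAUti's own implementation; the function name and the example sequence names are hypothetical and only follow the underscore-delimited pattern the tutorial describes.

```python
def location_from_name(name: str, delimiter: str = "_", group: int = 2) -> str:
    """Return the group-th (1-based) delimiter-separated field of a sequence name."""
    fields = name.split(delimiter)
    if group < 1 or group > len(fields):
        raise ValueError(f"{name!r} has no field number {group}")
    return fields[group - 1]


# Hypothetical sequence names: label, location and sampling time joined by underscores.
names = ["seqA_HongKong_2001.5", "seqB_NewZealand_2002.1"]

# Map every sequence name to the location stored in its second field.
locations = {name: location_from_name(name) for name in names}
print(locations)
```

Running the same check on your own sequence names before clicking `Guess` is a quick way to confirm that group `2` really holds the location.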
@@ -215,19 +211,16 @@ The BEAUTi panel should look as shown in [Figure 8](#fig:tip-types-set).
215
211
216
212
## Setting the substitution model
217
213
218
-
For this analysis, we will use the HKY substitution model with 4 gamma categories and estimated base frequencies.
214
+
For this analysis, we will use the JC69 substitution model with 4 gamma categories.
219
215
To configure this in BEAUti, switch to the `Site Model` panel.
220
216
First, we need to set up the rate category count.
221
217
To approximate the continuous gamma rate distribution BEAST2 uses the discrete gamma distribution, where sites are divided into k equally probable rate categories.
222
-
In general, 4-6 categories work well for most datasets, while having more categories involve a lot of computation at little precision gain, so we set the `Gamma category count` to 4.
218
+
In general, 4-6 categories work well for most datasets, while having more categories involves a lot of computation for little gain in precision, so we set the `Gamma Category Count` to 4.
223
219
We would also like to estimate the `Shape` parameter, which describes the shape of the continuous gamma distribution we approximate.
224
220
To do so, we need to set it to a non-zero value (e.g. the default 1.0) and tick the `estimate` checkbox.
225
221
While the gamma categories account for rate variation, allowing some sites to have an evolutionary rate of 0 can improve fit to real data.
226
222
To speed up the analysis we will fix this to the actual proportion of invariant sites we have in our alignment, which is 0.867.
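
The tutorial does not show how the value 0.867 was obtained. One simple way to arrive at such a number, sketched here purely as an assumption, is to count the alignment columns in which every sequence carries the same character. The function name and the toy alignment are hypothetical.

```python
def constant_site_fraction(sequences):
    """Fraction of alignment columns in which all sequences share one character."""
    if not sequences:
        raise ValueError("alignment is empty")
    length = len(sequences[0])
    if length == 0 or any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    # zip(*sequences) iterates over the alignment column by column.
    constant = sum(1 for column in zip(*sequences) if len(set(column)) == 1)
    return constant / length


# Toy alignment (hypothetical): columns 1, 2, 3 and 5 are constant.
toy_alignment = ["ACGTAC", "ACGTAT", "ACGAAC"]
fraction = constant_site_fraction(toy_alignment)
print(f"{fraction:.3f}")
```

Note that this counts observed constant columns, which is only a rough stand-in for the model's proportion of invariant sites.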
227
-
228
-
Next, to set up the substitution model, select `HKY` from the drop-down menu (the default option is `JC69`).
229
-
We would like to estimate the kappa parameter of HKY, so we leave the `Kappa` at the default value of 2.0 and leave the `estimate` checkbox checked.
230
-
We would also like to estimate nucleotide frequencies, so we leave the `Frequencies` parameter at the default value (`Estimated`).
223
+
We leave the substitution model to the default option `JC69`.
231
224
The BEAUti panel should now look as shown in [Figure 9](#fig:site-model).
232
225
233
226
<figure>
@@ -257,7 +250,7 @@ The `Clock Model` panel should now look as shown in [Figure 10](#fig:strict-cloc
257
250
</figure>
258
251
<br>
259
252
260
-
## Adjusting Priors
253
+
## Adjusting priors
261
254
262
255
### Setting up the `bdmm` tree prior
263
256
@@ -381,7 +374,7 @@ You can see the sampling prior setup in [Figure 15](#fig:samplingProportion-prio
381
374
<br>
382
375
383
376
384
-
For the purpose of this tutorial and given that we know little about the outbreak in question to set strict priors on the `rateMatrix`, we will leave the other priors on the default values, but feel free to through them yourself and verify their sensibility.
377
+
For the purpose of this tutorial, and given that we know too little about the outbreak in question to set strict priors on the `rateMatrix`, we will leave the other priors at their default values, but feel free to go through them yourself and check that they are sensible.
385
378
386
379
## Saving the configuration
387
380
@@ -408,30 +401,29 @@ we'll use to assemble a summary tree.
408
401
409
402
## Parameter log file analysis
410
403
411
-
We can use the program [Tracer](http://tree.bio.ed.ac.uk/software/tracer/) to view the parameter log file.
412
-
To do this, start Tracer and then press the `+` button in the top-left hand corner of the window (under `Trace files`).
413
-
Select the log file for this analysis (`h3n2_2deme.log`) from the file selection dialog box.
404
+
We can use the program [Tracer](https://github.com/beast-dev/tracer/releases/tag/v1.7.2) to view the parameter log file.
405
+
To do this, start Tracer and then press the `+` button in the top-left hand corner of the window (under `Trace File`).
406
+
Select the log file for this analysis (`h3n2-bdmm.log`) from the file selection dialog box.
414
407
You can also simply drag your log file from the file browser to the Tracer window.
415
408
The `Traces` table will then be populated with parameters and summary
416
409
statistics corresponding to our multitype birth-death analysis.
417
-
Note that the screen captures below were taken using Tracer 1.6 and may therefore slightly differ from what you see on screen.
418
410
419
411
Important traces are:
420
412
421
-
*`R0.t:h3n2_2deme1` and `R0.t:h3n2_2deme2`: These give the effective reproduction numbers for deme 1 (Hong Kong) and 2 (New Zealand), respectively.
413
+
*`R0.t:h3n2_2deme.1` and `R0.t:h3n2_2deme.2`: These give the effective reproduction numbers for deme 1 (Hong Kong) and 2 (New Zealand), respectively.
422
414
423
-
*`becomeUninfectiousRate.t:h3n2_deme21` and `becomeUninfectiousRate.t:h3n2_deme22`: These are the rates of recovery for someone with flu in either of the locations.
415
+
*`becomeUninfectiousRate.t:h3n2_2deme.1` and `becomeUninfectiousRate.t:h3n2_2deme.2`: These are the rates of recovery for someone with flu in either of the locations.
424
416
425
-
*`rateMatrix.t:h3n2_2deme1` and `rateMatrix.t:h3n2_2deme2`: These give the (per lineage per year) migration rates from deme 1 to 2 and vice versa.
417
+
*`rateMatrix.t:h3n2_2deme.1` and `rateMatrix.t:h3n2_2deme.2`: These give the (per lineage per year) migration rates from deme 1 to 2 and vice versa.
426
418
427
419
*`Tree.t:h3n2_2deme.count_HongKong_to_NewZealand`: This gives the number of ancestral migrations from Hong Kong to New Zealand on the inferred tree, **backwards in time**.
428
420
429
421
The tabs at the top-right of the window can be used to display one or more selected traces in various ways.
430
-
We can look at the become uninfectious rate by selecting the `becomeUninfectiousRate.t:h3n2_2deme1` trace (see [Figure 16](#fig:tracer-bUR)).
431
-
The 95% HPD for the parameter is quite wide ([18.2465, 93.2316]), which is most likely due to the fact that we have very little data, however the mean value is 50.102, which gives us an infectious period of 7.3 days.
432
-
Next, selecting the two R<sub>0</sub> traces (`R0.t:h3n2_2deme1` and `R0.t:h3n2_2deme2`) and choosing the `Marginal prob distribution` panel results in useful comparison between the sampled population size marginal posterior distributions (see [Figure 17](#fig:tracer-R0)).
422
+
We can look at the become uninfectious rate by selecting the `becomeUninfectiousRate.t:h3n2_2deme.1` trace (see [Figure 16](#fig:tracer-bUR)).
423
+
The 95% HPD for the parameter is quite wide ([18.699, 88.5431]), which is most likely because we have very little data. However, the mean value is 49.3247 per year, which corresponds to an infectious period of about 7.4 days.
424
+
Next, selecting the two R<sub>0</sub> traces (`R0.t:h3n2_2deme.1` and `R0.t:h3n2_2deme.2`) and choosing the `Marginal Density` panel gives a useful comparison between the marginal posterior distributions of R<sub>0</sub> in the two demes (see [Figure 17](#fig:tracer-R0)).
433
425
Looking at the posterior distributions, we cannot see any significant difference in R<sub>0</sub> between the two demes.
434
-
While the distributions are visibly different, they cover the same parameter range (deme 1 95% HPD interval [0.991, 1.0247], deme 2 95% HPD interval [0.9096, 1.0413]), so the values are indistinguishable through such analysis.
426
+
While the distributions are visibly different, they cover the same parameter range (deme 1 95% HPD interval [0.9922, 1.0258], deme 2 95% HPD interval [0.9057, 1.038]), so the values are indistinguishable through such analysis.
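
The two checks made in this paragraph can be reproduced with a few lines of arithmetic. The sketch below assumes the become-uninfectious rate is expressed per year and uses 365 days per year; the function names are hypothetical, and the numbers are the ones reported above.

```python
DAYS_PER_YEAR = 365.0  # assumption: rates are per year


def infectious_period_days(rate_per_year: float) -> float:
    """Expected infectious period in days for a given become-uninfectious rate."""
    if rate_per_year <= 0:
        raise ValueError("rate must be positive")
    return DAYS_PER_YEAR / rate_per_year


def intervals_overlap(a, b) -> bool:
    """True if the closed intervals a = (low, high) and b = (low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# Mean become-uninfectious rate reported by Tracer in this run.
period = infectious_period_days(49.3247)
print(round(period, 1))  # 7.4

# 95% HPD intervals of R0 for deme 1 and deme 2.
overlap = intervals_overlap((0.9922, 1.0258), (0.9057, 1.038))
print(overlap)  # True
```

Overlapping HPD intervals are only an informal indication that the two R<sub>0</sub> values cannot be told apart; they are not a formal test.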
435
427
436
428
<figure>
437
429
<a id="fig:tracer-bUR"></a>
@@ -509,7 +501,7 @@ The setup can be seen in [Figure 20](#fig:TreeAnnotator-setup).
509
501
510
502
Pressing the `Run` button will produce an annotated summary tree.
511
503
512
-
To visualize this tree, open IcyTree once more (maybe open it in a new browser tab), choose `File > Open`, then select the file `h3n2_2deme.h3n2_2deme.summary.tree` using the file selection dialog.
504
+
To visualize this tree, open IcyTree once more (maybe open it in a new browser tab), choose `File > Load from file`, then select the file `h3n2-bdmm.h3n2_2deme.summary.trees` using the file selection dialog.
513
505
Follow the instructions provided above to colour the tree by the `type` attribute and add the legend and time axis.
514
506
In addition, open the `Style` menu and select `Node height error bars > height_95%_HPD` to add error bars to the internal node heights.
515
507
Finally, open the `Style` menu and select `Relative edge width > type.prob`.
@@ -528,7 +520,7 @@ Here we have a full consensus tree annotated by the locations at coalescence nod
528
520
This is a much more comprehensive summary of the phylogenetic side of our analysis.
529
521
One thing to pay attention to here is that the most probable root location in the summary tree is Hong Kong (under our model which assumes that only Hong Kong and New Zealand exist).
530
522
Hovering the mouse cursor over the tiny edge above the root will bring up a table in which the posterior probability of the displayed root location (`type.prob`) can be seen.
531
-
In this analysis we see that it is about 88.8%.
523
+
In this analysis we see that it is about 91%.
532
524
The analysis therefore strongly supports a Hong Kong origin over a New Zealand origin for this flu sample.
533
525
534
526
<!--[Very useful final notes from Tim](https://github.com/CompEvol/MultiTypeTree/wiki/Beginner%27s-Tutorial-%28short-version%29#final-notes)-->