Commit a45898e

Merge pull request #4 from EtthelWindels/master
Update figures and precooked runs for BEAST2.7 compatibility
2 parents f506e56 + fcb1959 commit a45898e

30 files changed

+13335
-13407
lines changed

.DS_Store

6 KB
Binary file not shown.

README.md

+31-39
@@ -3,8 +3,8 @@ author: Denise Kühnert, Jūlija Pečerska
 level: Professional
 title: Structured birth-death model
 subtitle: Population structure using the multi-type birth-death model
-beastversion: 2.5.2
-tracerversion: 1.7.0
+beastversion: 2.7.4
+tracerversion: 1.7.2
 ---


@@ -46,7 +46,7 @@ TreeAnnotator is provided as a part of the BEAST2 package so you do not need to

 ### Tracer

-Tracer ([http://tree.bio.ed.ac.uk/software/tracer](http://tree.bio.ed.ac.uk/software/tracer)) is used to summarize the posterior estimates of the various parameters sampled by the Markov Chain. This program can be used for visual inspection and to assess convergence. It helps to quickly view median estimates and 95% highest posterior density intervals of the parameters, and calculates the effective sample sizes (ESS) of parameters. It can also be used to investigate potential parameter correlations. We will be using Tracer v{{ page.tracerversion }}.
+Tracer ([https://github.com/beast-dev/tracer/releases/tag/v1.7.2](https://github.com/beast-dev/tracer/releases/tag/v1.7.2)) is used to summarize the posterior estimates of the various parameters sampled by the Markov Chain. This program can be used for visual inspection and to assess convergence. It helps to quickly view median estimates and 95% highest posterior density intervals of the parameters, and calculates the effective sample sizes (ESS) of parameters. It can also be used to investigate potential parameter correlations. We will be using Tracer v{{ page.tracerversion }}.

 ### IcyTree

@@ -62,7 +62,7 @@ IcyTree ([https://icytree.org](https://icytree.org)) is a browser-based phylogen
 You can easily install the `bdmm` package via BEAUti's package manager. To do this, follow these steps:

 1. Start BEAUti;
-2. In the application menu, `File > Manage packages`.
+2. In the application menu, `File > Manage Packages`.
 3. Find `bdmm` in the list of packages shown, select it and then click `Install/Upgrade`.

 The BEAUTi window should look similar to what is shown in [Figure 1](#fig:install-bdmm).
@@ -82,7 +82,7 @@ If you get an error message stating that you are missing a package on which `bdm

 # Setting up the analysis using BEAUti

-## Loading the Template
+## Loading the template

 A BEAUTi template defines the basic structure and contents of your XML configuration file.
 By default BEAUTi will construct an XML file with standard uncoloured BEAST trees, however `bdmm` uses coloured trees which are defined in the `MultiTypeTree` package.
@@ -98,23 +98,19 @@ To use the appropriate template for the configuration file, select `File > Templ

 ## Loading the data

-Once the template is loaded, we can load in our example sequence data. In our case, this data is stored in a FASTA file, the first few lines of which look like this (the sequences have been truncated for better readability):
+Once the template is loaded, we can load in our example sequence data. In our case, these data are stored in a FASTA file, the first few lines of which look like this (the sequences have been truncated for better readability):

 ```
 > EU856841_HongKong_2005.34246575
------------GGGATAATTCTATTAACCATGAAGACTATCATTGCTTTGAGCTACATTT...
+-----------ATGAAGACTATCATTGCTTTGAGCTACATTCTATGTCTGGTTTTCGCTC...
 > EU856989_HongKong_2002.58356164
---CAAAAGCAGGGGATAATTCTATTAACCATGAAGACTATCATTGCTTTGAGCTACATTT...
+-----------ATGAAGACTATCATTGCTTTGAGCTACATTCTATGTCTGGTTTTCGCTC...
 > CY039495_HongKong_2004.5890411
-------------------TTCTATTAACCATGAAGACTATCATTGCTTTGAGCTACATTC...
+-----------ATGAAGACTATCATTGCTTTGAGCTACATTCTATGTCTGGTTTTCGCTC...
 > EU856853_HongKong_2001.17808219
----------------------TATTAACCATGAAGACTATCATTGCTTTGAGCTACATTC...
-> CY010084_NewZealand_2005.62739726
----------------------TATTAACCATGAAGACTATCATTGCTTTGAGCTACATTC...
-> CY007387_NewZealand_2004.63287671
----------------------TATTAACCATGAAGACTATCATTGCTTTGAGCTACATTC...
-> CY012432_NewZealand_2000.81643836
----------------------------CCATGAAGACTATCATTGCTTTGAGCTACATTT...
+-----------ATGAAGACTATCATTGCTTTGAGCTACATTTTATGTCTGGTTTTCGCTC...
+> EU857026_HongKong_2003.51232877
+-----------ATGAAGGCTATCATTGCTTTGAGCTACATTCTATGTCTGGTTTTCGCTC...
 ```

 The lines beginning with ">" are labels for the sequences immediately
@@ -136,7 +132,7 @@ To set the working directory, select `File > Set working dir > MultiTypeTree`, a
 </figure>
 <br>

-To load the file, select `File > Add alignment`.
+To load the file, select `File > Add Alignment`.

 This will open a file selection dialog box. The example influenza sequence data
 file is named `h3n2_2deme.fna`.
@@ -189,7 +185,7 @@ Now that we've specified the sampling times, we move on to specifying the sampli
 To do this, we follow a very similar set of steps to those we used to set the sample times:

 1. Select the `Tip Locations` panel. You'll find that the locations are already filled with a single default value – `NOT_SET`.
-2. Click the `Guess` button at the top-right of the panel. This opens the same dialog that we saw in the previous section when setting up the dates.
+2. Click the `Guess` button at the top-left of the panel. This opens the same dialog that we saw in the previous section when setting up the dates.
 3. The locations are included as the second element of the underscore-delimited sequence names.
 Therefore we choose the `split on character` radio button and select group `2` from the drop-down menu.
 Note again that the underscore character is already chosen as the delimiter.
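
Editor's illustration (not part of the commit): the `split on character` step above can be mimicked in a few lines of Python. This is a minimal sketch; the helper names `split_sequence_name`, `location_of` and `sampling_date_of` are hypothetical and are not part of BEAUti, and reading the last group as a decimal-year sampling date is an assumption based on the example sequence names in the FASTA excerpt.

```python
def split_sequence_name(name, delimiter="_"):
    """Split a sequence name into its delimiter-separated groups.

    Hypothetical helper, not part of BEAUti. A leading '>' and spaces,
    as found on FASTA header lines, are dropped before splitting.
    """
    return name.lstrip("> ").strip().split(delimiter)


def location_of(name):
    # Group 2 (counting from 1, as in the BEAUti drop-down) is the location.
    return split_sequence_name(name)[1]


def sampling_date_of(name):
    # Assumption: the last group is the sampling date as a decimal year.
    return float(split_sequence_name(name)[-1])


if __name__ == "__main__":
    headers = [
        "> EU856841_HongKong_2005.34246575",
        "> CY010084_NewZealand_2005.62739726",
    ]
    for header in headers:
        print(location_of(header), sampling_date_of(header))
```

Running a check like this over all sequence names before opening BEAUti is one way to confirm that every name yields a location, so that no tip is left at `NOT_SET`.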
@@ -215,19 +211,16 @@ The BEAUTi panel should look as shown in [Figure 8](#fig:tip-types-set).

 ## Setting the substitution model

-For this analysis, we will use the HKY substitution model with 4 gamma categories and estimated base frequencies.
+For this analysis, we will use the JC69 substitution model with 4 gamma categories.
 To configure this in BEAUti, switch to the `Site Model` panel.
 First, we need to set up the rate category count.
 To approximate the continuous gamma rate distribution BEAST2 uses the discrete gamma distribution, where sites are divided into k equally probable rate categories.
-In general, 4-6 categories work well for most datasets, while having more categories involve a lot of computation at little precision gain, so we set the `Gamma category count` to 4.
+In general, 4-6 categories work well for most datasets, while having more categories involve a lot of computation at little precision gain, so we set the `Gamma Category Count` to 4.
 We would also like to estimate the `Shape` parameter, which describes the shape of the continuous gamma distribution we approximate.
 To do so, we need to set it to a non-zero value (e.g. the default 1.0) and tick the `estimate` checkbox.
 While the gamma categories account for rate variation, allowing some sites to have an evolutionary rate of 0 can improve fit to real data.
 To speed up the analysis we will fix this to the actual proportion of invariant sites we have in our alignment, which is 0.867.
-
-Next, to set up the substitution model, select `HKY` from the drop-down menu (the default option is `JC69`).
-We would like to estimate the kappa parameter of HKY, so we leave the `Kappa` at the default value of 2.0 and leave the `estimate` checkbox checked.
-We would also like to estimate nucleotide frequencies, so we leave the `Frequencies` parameter at the default value (`Estimated`).
+We leave the substitution model to the default option `JC69`.
 The BEAUti panel should now look as shown in [Figure 9](#fig:site-model).

 <figure>
@@ -257,7 +250,7 @@ The `Clock Model` panel should now look as shown in [Figure 10](#fig:strict-cloc
 </figure>
 <br>

-## Adjusting Priors
+## Adjusting priors

 ### Setting up the `bdmm` tree prior

@@ -381,7 +374,7 @@ You can see the sampling prior setup in [Figure 15](#fig:samplingProportion-prio
 <br>


-For the purpose of this tutorial and given that we know little about the outbreak in question to set strict priors on the `rateMatrix`, we will leave the other priors on the default values, but feel free to through them yourself and verify their sensibility.
+For the purpose of this tutorial and given that we know little about the outbreak in question to set strict priors on the `rateMatrix`, we will leave the other priors on the default values, but feel free to go through them yourself and verify their sensibility.

 ## Saving the configuration

@@ -408,30 +401,29 @@ we'll use to assemble a summary tree.

 ## Parameter log file analysis

-We can use the program [Tracer](http://tree.bio.ed.ac.uk/software/tracer/) to view the parameter log file.
-To do this, start Tracer and then press the `+` button in the top-left hand corner of the window (under `Trace files`).
-Select the log file for this analysis (`h3n2_2deme.log`) from the file selection dialog box.
+We can use the program [Tracer](https://github.com/beast-dev/tracer/releases/tag/v1.7.2) to view the parameter log file.
+To do this, start Tracer and then press the `+` button in the top-left hand corner of the window (under `Trace File`).
+Select the log file for this analysis (`h3n2-bdmm.log`) from the file selection dialog box.
 You can also simply drag your log file from the file browser to the Tracer window.
 The `Traces` table will then be populated with parameters and summary
 statistics corresponding to our multitype birth-death analysis.
-Note that the screen captures below were taken using Tracer 1.6 and may therefore slightly differ from what you see on screen.

 Important traces are:

-* `R0.t:h3n2_2deme1` and `R0.t:h3n2_2deme2`: These give the effective reproduction numbers for deme 1 (Hong Kong) and 2 (New Zealand), respectively.
+* `R0.t:h3n2_2deme.1` and `R0.t:h3n2_2deme.2`: These give the effective reproduction numbers for deme 1 (Hong Kong) and 2 (New Zealand), respectively.

-* `becomeUninfectiousRate.t:h3n2_deme21` and `becomeUninfectiousRate.t:h3n2_deme22`: These are the rates of recovery for someone with flu in either of the locations.
+* `becomeUninfectiousRate.t:h3n2_2deme.1` and `becomeUninfectiousRate.t:h3n2_2deme.2`: These are the rates of recovery for someone with flu in either of the locations.

-* `rateMatrix.t:h3n2_2deme1` and `rateMatrix.t:h3n2_2deme2`: These give the (per lineage per year) migration rates from deme 1 to 2 and vice versa.
+* `rateMatrix.t:h3n2_2deme.1` and `rateMatrix.t:h3n2_2deme.2`: These give the (per lineage per year) migration rates from deme 1 to 2 and vice versa.

 * `Tree.t:h3n2_2deme.count_HongKong_to_NewZealand`: these give the number of ancestral migrations from Hong Kong to New Zealand on the inferred tree, **backwards in time**.

 The tabs at the top-right of the window can be used to display one or more selected traces in various ways.
-We can look at the become uninfectious rate by selecting the `becomeUninfectiousRate.t:h3n2_2deme1` trace (see [Figure 16](#fig:tracer-bUR)).
-The 95% HPD for the parameter is quite wide ([18.2465, 93.2316]), which is most likely due to the fact that we have very little data, however the mean value is 50.102, which gives us an infectious period of 7.3 days.
-Next, selecting the two R<sub>0</sub> traces (`R0.t:h3n2_2deme1` and `R0.t:h3n2_2deme2`) and choosing the `Marginal prob distribution` panel results in useful comparison between the sampled population size marginal posterior distributions (see [Figure 17](#fig:tracer-R0)).
+We can look at the become uninfectious rate by selecting the `becomeUninfectiousRate.t:h3n2_2deme.1` trace (see [Figure 16](#fig:tracer-bUR)).
+The 95% HPD for the parameter is quite wide ([18.699, 88.5431]), which is most likely due to the fact that we have very little data, however the mean value is 49.3247, which gives us an infectious period of 7.4 days.
+Next, selecting the two R<sub>0</sub> traces (`R0.t:h3n2_2deme.1` and `R0.t:h3n2_2deme.2`) and choosing the `Marginal Density` panel results in useful comparison between the sampled population size marginal posterior distributions (see [Figure 17](#fig:tracer-R0)).
 Looking at the posterior distributions we can not see any significant difference in R<sub>0</sub> between the two demes.
-While the distributions are visibly different, they cover the same parameter range (deme 1 95% HPD interval [0.991, 1.0247], deme 2 95% HPD interval [0.9096, 1.0413]), so the values are indistinguishable through such analysis.
+While the distributions are visibly different, they cover the same parameter range (deme 1 95% HPD interval [0.9922, 1.0258], deme 2 95% HPD interval [0.9057, 1.038]), so the values are indistinguishable through such analysis.
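
Editor's illustration (not part of the commit): the arithmetic behind the updated numbers can be checked with a short Python sketch. It assumes the become uninfectious rate is expressed per year and that a year has 365 days; neither is stated explicitly in the diff, although both reproduce the quoted 7.3 and 7.4 days. The function names are hypothetical. Inverting the HPD bounds of the rate gives only a rough range for the infectious period, not a proper HPD interval of the period itself.

```python
DAYS_PER_YEAR = 365.0  # assumption: rates are per year, one year = 365 days


def infectious_period_days(become_uninfectious_rate):
    """Convert a per-year become uninfectious rate into a period in days."""
    if become_uninfectious_rate <= 0:
        raise ValueError("rate must be positive")
    return DAYS_PER_YEAR / become_uninfectious_rate


def intervals_overlap(a, b):
    """Return True if the closed intervals a=(lo, hi) and b=(lo, hi) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Mean rates quoted before and after the commit.
    for mean_rate in (50.102, 49.3247):
        print(mean_rate, round(infectious_period_days(mean_rate), 1))

    # A higher rate means a shorter period, so the bounds swap roles.
    hpd_low, hpd_high = 18.699, 88.5431
    print(infectious_period_days(hpd_high), infectious_period_days(hpd_low))

    # The updated R0 intervals for the two demes overlap.
    print(intervals_overlap((0.9922, 1.0258), (0.9057, 1.038)))
```

The overlap check mirrors the statement that the two R<sub>0</sub> values are indistinguishable: the deme 1 interval lies entirely inside the deme 2 interval.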

 <figure>
 <a id="fig:tracer-bUR"></a>
@@ -509,7 +501,7 @@ The setup can be seen in [Figure 20](#fig:TreeAnnotator-setup).

 Pressing the `Run` button will produce an annotated summary tree.

-To visualize this tree, open IcyTree once more (maybe open it in a new browser tab), choose `File > Open`, then select the file `h3n2_2deme.h3n2_2deme.summary.tree` using the file selection dialog.
+To visualize this tree, open IcyTree once more (maybe open it in a new browser tab), choose `File > Load from file`, then select the file `h3n2-bdmm.h3n2_2deme.summary.trees` using the file selection dialog.
 Follow the instructions provided above to colour the tree by the `type` attribute and add the legend and time axis.
 In addition, open the `Style` menu and select `Node height error bars > height_95%_HPD` to add error bars to the internal node heights.
 Finally, open the `Style` menu and select `Relative edge width > type.prob`.
@@ -528,7 +520,7 @@ Here we have a full consensus tree annotated by the locations at coalescence nod
 This is a much more comprehensive summary of the phylogenetic side of our analysis.
 One thing to pay attention to here is that the most probable root location in the summary tree is Hong Kong (under our model which assumes that only Hong Kong and New Zealand exist).
 Hovering the mouse cursor over the tiny edge above the root will bring up a table in which posterior probability of the displayed root location (`type.prob`) can be seen.
-In this analysis we see that it is about 88.8%.
+In this analysis we see that it is about 91%.
 The analysis therefore strongly supports a Hong Kong origin over a New Zealand origin for this flu sample.

 <!--[Very useful final notes from Tim](https://github.com/CompEvol/MultiTypeTree/wiki/Beginner%27s-Tutorial-%28short-version%29#final-notes)-->

figures/1-install-bdmm.png

26.3 KB

figures/10-strict-clock.png

19.5 KB

figures/11-tree-prior.png

19 KB

figures/12-R0-prior.png

-158 KB

figures/13-bUR-prior.png

65.7 KB

figures/14-clock-rate-prior.png

45.6 KB

-181 KB

figures/16-tracer-bUR.png

-78.5 KB

figures/17-tracer-R0.png

-79.1 KB

figures/18-icyTree-trees.png

-375 KB

figures/19-icyTree-MAP.png

-369 KB

figures/2-choose-bdmm-template.png

60.5 KB

figures/20-TreeAnnotator-setup.png

7.74 KB

figures/21-icyTree-summary.png

-370 KB

figures/3-set-working-dir.png

-27.3 KB

figures/4-alignment-loaded.png

7.8 KB

figures/5-tip-dates.png

-184 Bytes

figures/6-tip-dates-set.png

2.91 KB

figures/7-tip-types.png

-64 Bytes

figures/8-tip-types-set.png

-73 KB

figures/9-sitemodel.png

15 KB
