---
title: "Using External Simulators"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using External Simulators}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
---
Coala can call the coalescent simulators _ms_[1], _msms_[2] and _scrm_[3]
and can use _seq-gen_[4] for finite sites simulations.
The R version of _scrm_ should be installed automatically as a dependency of
coala. For the other programs, you need to have an executable binary available
on your system.
# Installation
Short instructions on obtaining and compiling the programs are
given in the help pages of `activate_ms`, `activate_msms` and `activate_seqgen`.
More detailed instructions are provided in [the wiki](https://github.com/statgenlmu/coala/wiki/Installation).
# Activation
In addition to providing the binary for a simulator, you need to inform `coala`
where the binary is. We refer to this process as _activation_ of a binary.
Afterwards, `coala` will use the simulator automatically
wherever appropriate.

There are four different ways to make a simulator available:

1. Use the `activate_msms` and `activate_seqgen` functions to
   activate the simulators from within R. You should use the functions before
   creating a model.
2. Alternatively, you can place the binaries in your working
   directory or in a folder listed in your _PATH_ environment variable using
   one of the names listed under "Expected Binary Names" below. If there is
   a matching file, coala will automatically activate the simulator.
3. You can start the R session with an environment variable
   that holds the path to the binaries (see the "Environment Var" column
   below). In this case, the simulators should also be activated
   automatically when the coala package is loaded.
4. Coala uses the R versions of `scrm` and `ms`. `scrm` should always be
   available. Install the CRAN package `phyclust` to use `ms`.

| Simulator | Priority | Expected Binary Names | Environment Var | Function |
| --- | --- | --- | --- | --- |
| seq-gen | 100 | seqgen, seq-gen, seqgen.exe, seq-gen.exe | SEQGEN | activate_seqgen |
| msms | 200 | msms.jar / java, java.exe | MSMS / JAVA | activate_msms |
| ms | 300 | | | activate_ms |
| scrm | 400 | | | |
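
As a minimal sketch of the first option, the calls below activate seq-gen and msms from within R. All paths are placeholders, and the argument names `jar` and `java` are assumed to match the help page of `activate_msms`; consult `?activate_msms` and `?activate_seqgen` if they differ in your version of coala.

```{r eval=FALSE}
library(coala)

# Placeholder paths (assumptions): replace them with the locations
# of the binaries on your system.
activate_seqgen("/path/to/seq-gen")
activate_msms(jar = "/path/to/msms.jar", java = "/path/to/java")
```

Because activation has to happen before a model is created, calls like these usually belong at the top of a script.
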
You can use the `list_simulators()` command to view which
simulators are currently available:
```{r size="small"}
library(coala)
list_simulators()
```
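
If a simulator is missing from this list, it can help to check what R itself can see. The following sketch uses only base R together with the binary names and environment variables from the table above. It is a diagnostic aid under those assumptions, not coala's own lookup code, which may differ in detail.

```{r}
# Binary names taken from the "Expected Binary Names" column.
expected <- c("seqgen", "seq-gen", "seqgen.exe", "seq-gen.exe",
              "java", "java.exe")

# Sys.which() returns the full path for names found on the PATH
# and an empty string for names that are not found.
on_path <- Sys.which(expected)
on_path[nzchar(on_path)]

# Files with an expected name in the current working directory.
# msms.jar is a file rather than a program, so it is only checked here.
in_wd <- intersect(c(expected, "msms.jar"), list.files())
in_wd

# Environment variables from the table; unset variables come back as "".
env_vars <- Sys.getenv(c("SEQGEN", "MSMS", "JAVA"))
env_vars
```
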
# Priority
The `check_model` function checks which simulators support a specific model
and reports the problems coala has detected for the simulators that do not
support it.
For example, a simple model with infinite-sites mutations (IFS) can be simulated
with `scrm` or -- if installed -- with `ms` and `msms`, but not with `seq-gen`
because the latter generates finite-sites mutations:
```{r}
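# Ten samples at a single locus, with infinite-sites mutations
# and nucleotide diversity as the summary statistic
model <- coal_model(10, 1) +
  feat_mutation(5, model = "IFS") +
  sumstat_nucleotide_div()

# Report which simulators can handle this model, and why the others cannot
check_model(model)

# Print a summary of the model
model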
```
If multiple simulators can simulate a model, the one with the highest
priority is used. In our example, that is `scrm`. If we would like to use
`ms` instead, we need to raise its priority:
```{r eval=FALSE}
activate_ms(priority = 500)
```
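
The selection rule can be illustrated with a small base R sketch. It uses the default priorities from the table above and assumes that all four simulators are installed. The data frame and the helper `pick_simulator` are hypothetical and only mimic the rule described here; they are not part of coala.

```{r}
# Default priorities from the table above. `supports_ifs` encodes the
# infinite-sites example: seq-gen cannot simulate that model.
simulators <- data.frame(
  name = c("seq-gen", "msms", "ms", "scrm"),
  priority = c(100, 200, 300, 400),
  supports_ifs = c(FALSE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: among the simulators that support the model,
# return the name of the one with the highest priority.
pick_simulator <- function(sims) {
  usable <- sims[sims$supports_ifs, ]
  usable$name[which.max(usable$priority)]
}

# With the default priorities, scrm is selected.
pick_simulator(simulators)

# Mimic the effect of activate_ms(priority = 500): ms is now selected.
simulators$priority[simulators$name == "ms"] <- 500
pick_simulator(simulators)
```
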
# References
* __[1]__: Richard R. Hudson.
_Generating samples under a Wright-Fisher neutral model of genetic variation._
Bioinformatics (2002) 18 (2): 337-338
[10.1093/bioinformatics/18.2.337](https://doi.org/10.1093/bioinformatics/18.2.337)
* __[2]__: Gregory Ewing and Joachim Hermisson.
_MSMS: a coalescent simulation program including recombination,
demographic structure and selection at a single locus._
Bioinformatics (2010) 26 (16): 2064-2065
[10.1093/bioinformatics/btq322](https://doi.org/10.1093/bioinformatics/btq322)
* __[3]__: Paul R. Staab, Sha Zhu, Dirk Metzler and Gerton Lunter.
_scrm: efficiently simulating long sequences using the approximated
coalescent with recombination._
Bioinformatics (2015) 31 (10): 1680-1682
[10.1093/bioinformatics/btu861](https://doi.org/10.1093/bioinformatics/btu861)
* __[4]__: Andrew Rambaut and Nicholas C. Grassly.
_Seq-Gen: an application for the Monte Carlo simulation of DNA sequence
evolution along phylogenetic trees._
Comput Appl Biosci (1997) 13 (3): 235-238
[10.1093/bioinformatics/13.3.235](https://doi.org/10.1093/bioinformatics/13.3.235)