/
continuous.Rmd
889 lines (658 loc) · 44 KB
/
continuous.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
---
title: "Spread of a pathogen in a continuous space"
author: "Sebastian Lequime"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Spread of a pathogen in a continuous space}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
%\VignetteDepends{ggplot2}
%\VignetteDepends{viridis}
%\VignetteDepends{dplyr}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
Aside from the simple simulation set up, explored in [another tutorial](none.html), where hosts are "not structured", `nosoi` can take into account a population structured either in discrete states or in a continuous space.
We focus here on a continuous space structure (for the discrete structure, see [this tutorial](discrete.html)).
The continuous space is intended to allow the simulation to take place in a geographical context where hosts can move on a map, and may be influenced by heterogeneously distributed variables (environmental for example) on said map.
This tutorial focuses on setting up a `nosoi` simulation for a pathogen which spread occurs in a continuous space.
# Structure of the population
## Continuous space
In the continuous space, the simulation takes place on a map defined by its coordinates. This map is actually a raster, here denoted by `structure.raster`, that also holds an environmental value of interest that may influence the epidemic process. Areas were hosts cannot move (water, inaccessible areas, etc) must be defined by the value `NA`.
Here, we generate a random raster for the purpose of this tutorial, but you can use a "real" world raster and simulate your epidemic on it (see [this example](https://slequime.github.io/nosoi/articles/examples/ebola.html)).
<details>
<summary>Code used to create the random raster used in this tutorial</summary>
```{r RasterCode, eval = FALSE}
library(raster)
set.seed(860)
test.raster <- raster(nrows=100, ncols=100, xmn=-50, xmx=50, ymn=-50,ymx=50)
test.raster[] <- runif(10000, -80, 150)
test.raster <- focal(focal(test.raster, w=matrix(1, 5, 5), mean), w=matrix(1, 5, 5), mean)
```
</details>
```{r RasterCode2, echo = FALSE, message=FALSE}
library(raster)
set.seed(860)
test.raster <- raster(nrows=100, ncols=100, xmn=-50, xmx=50, ymn=-50,ymx=50)
test.raster[] <- runif(10000, -80, 150)
test.raster <- focal(focal(test.raster, w=matrix(1, 5, 5), mean), w=matrix(1, 5, 5), mean)
test.raster_df <- as.data.frame(test.raster, xy = TRUE)
if (!(requireNamespace("ggplot2", quietly = TRUE) &&
requireNamespace("viridis", quietly = TRUE))) {
message("Packages 'ggplot2' and 'viridis' are needed for plotting this figure.")
} else {
library(ggplot2)
library(viridis)
ggplot() +
geom_raster(data=test.raster_df, aes(x=x,y=y,fill=layer)) +
scale_fill_viridis(limits=c(0,NA),na.value="white", name="Environmental\nvalue") +
labs(caption = "Environmental raster") + theme_minimal() + coord_quickmap()
}
```
## Movement kernel
Movements in `nosoi` are basically described by a random walk (Brownian motion), that can be attracted by the underlying environmental variable (`attracted.by.raster`). In any case, the movement cannot land into areas out of the raster nor into areas of the raster with no value (`NA`).
The user provides a core function `sdMove` that gives the standard deviation of this Brownian motion. This function, [like the other core functions](nosoi.html#setting-up-the-core-functions), can be dependent on the time since infection `t`, the absolute time of the simulation `prestime`, the location (value of the environmental variable) or individual-based parameters. This standard deviation is in your coordinate space, so the actual distance traveled by the host may actually vary according to your space.
In the case where hosts are attracted by the raster, hosts will tend to go to positions with high environmental values. Internally, once a new position is proposed, its environmental value is extracted and normalized (against the highest environmental value). This normalized number represents its probability of acceptance. Up to 30 draws can be made until one is accepted (by default the 30$^{th}$ move is always accepted; this value was empirically set to allow enough exploration without being stuck in a time consuming search).
The figure below shows a movement on a simple gradient raster, with or without an attraction (`sdMove` is a constant fixed at 0.01, time is 1000 units of time, maximum value of the environmental variable is 300). The red point represents the starting position.
```{r RW-1, echo=FALSE, message = FALSE}
library(data.table)
library(raster)
x = seq(0,100,len=100)
y = seq(0,100,len=100)
a=1;b=2
r12 = raster(outer(x,y,function(x,y){a*x+b*y}))
test.raster_df <- as.data.frame(r12, xy = TRUE)
max.raster=max(r12[], na.rm=T)
set.seed(95895)
move.test = data.table(t=0,current.in.x=0.25,current.in.y=0.75)
sdMove.values = 0.01
moveRotateContinuous.example = function(pt1, dX, dY, angle)
{
s = sin(angle); c = cos(angle)
x = dX; y = dY
x_new = (x*c)-(y*s); y_new = (x*s)+(y*c)
x_new = x_new+pt1[1]; y_new = y_new+pt1[2]
return(c(x_new,y_new))
}
for (t in 2:1001){
real.t = t-1
current.move.pos.x = move.test[real.t,"current.in.x"]
current.move.pos.y = move.test[real.t,"current.in.y"]
current.sdMove.value = as.numeric(sdMove.values)
positionFound1 = FALSE
while (positionFound1 == FALSE)
{
counter = 0
dX = rnorm(1, 0, current.sdMove.value)
dY = rnorm(1, 0, current.sdMove.value)
positionFound2 = FALSE
while (positionFound2 == FALSE)
{
angle = (2*base::pi)*runif(1)
newP = moveRotateContinuous.example(c(as.numeric(current.move.pos.x),as.numeric(current.move.pos.y)), dX, dY,angle)
temp.env.value = raster::extract(r12,cbind(newP[1],newP[2]))
if (!is.na(temp.env.value)){
if (TRUE) {
counter = counter+1
v2 = temp.env.value/max.raster
if (runif(1,0,1) < v2)
{
move.temp= data.table(t=real.t,current.in.x=newP[1],current.in.y=newP[2])
positionFound2 = TRUE
positionFound1 = TRUE
if (counter == 30) {
positionFound1 = TRUE
positionFound2 = TRUE
}
}
}
}
}
}
move.temp= data.table(t=real.t,current.in.x=newP[1],current.in.y=newP[2])
move.test = rbindlist(list(move.test,move.temp))
}
move.test$type = "Attracted by raster"
move.test2 = data.table(t=0,current.in.x=0.25,current.in.y=0.75)
for (t in 2:1001){
real.t = t-1
current.move.pos.x = move.test2[real.t,"current.in.x"]
current.move.pos.y = move.test2[real.t,"current.in.y"]
current.sdMove.value = as.numeric(sdMove.values)
positionFound1 = FALSE
while (positionFound1 == FALSE)
{
counter = 0
dX = rnorm(1, 0, current.sdMove.value)
dY = rnorm(1, 0, current.sdMove.value)
positionFound2 = FALSE
while (positionFound2 == FALSE)
{
angle = (2*base::pi)*runif(1)
newP = moveRotateContinuous.example(c(as.numeric(current.move.pos.x),as.numeric(current.move.pos.y)), dX, dY,angle)
temp.env.value = raster::extract(r12,cbind(newP[1],newP[2]))
if (!is.na(temp.env.value)){
move.temp= data.table(t=real.t,current.in.x=newP[1],current.in.y=newP[2])
positionFound1 = TRUE
positionFound2 = TRUE
}
}
}
move.test2 = rbindlist(list(move.test2,move.temp))
}
move.test2$type = "Not attracted by raster"
move.test.all = rbindlist(list(move.test,move.test2))
if (!(requireNamespace("ggplot2", quietly = TRUE) &&
requireNamespace("viridis", quietly = TRUE))) {
message("Packages 'ggplot2' and 'viridis' are needed for plotting this figure.")
} else {
ggplot() +
geom_raster(data=test.raster_df, aes(x=x,y=y,fill=layer)) +
# geom_line(data = move.test, mapping = aes(x = current.in.x, y = current.in.y), color = "grey90") +
geom_point(data = move.test.all, aes(x = current.in.x, y = current.in.y,color=t)) +
geom_point(data=data.table(t=0,current.in.x=0.25,current.in.y=0.75), aes(x = current.in.x, y = current.in.y), color = "firebrick1",size=3) +
theme_minimal() + facet_wrap(~type) +
scale_fill_viridis(limits=c(0,NA),na.value="white", name="Environmental\nvalue") +
scale_color_viridis(option="magma",limits=c(0,NA),na.value="white", name="Time")
}
```
# Setting up the simulation
The wrapper function `nosoiSim` takes all the arguments that will be passed down to the simulator, in the case of this tutorial `singleContinuous` (for "single host, continuous space").
We thus start by providing the options `type="single"` and `popStructure="continuous"` to set up the analysis:
```{r setupA, eval = FALSE}
SimulationSingle <- nosoiSim(type="single", popStructure="continuous", ...)
```
This simulation type requires several arguments or options in order to run, namely:
- `length.sim`
- `max.infected`
- `init.individuals`
- `init.structure`
- `structure.raster`
- `pExit` with `param.pExit`, `timeDep.pExit`, `diff.pExit` and `hostCount.pExit`
- `pMove` with `param.pMove`, `timeDep.pMove`, `diff.pMove` and `hostCount.pMove`
- `sdMove` with `param.sdMove`, `timeDep.sdMove`, `diff.sdMove` and `hostCount.sdMove`
- `attracted.by.raster`
- `nContact` with `param.nContact`, `timeDep.nContact`, `diff.nContact` and `hostCount.nContact`
- `pTrans` with `param.pTrans`, `timeDep.pTrans`, `diff.pTrans` and `hostCount.pTrans`
- `prefix.host.A`
- `print.progress`
- `print.step`
All the `param.*` elements provide individual-level parameters to be taken into account, while the `timeDep.*` elements inform the simulator if the "absolute" simulation time should be taken into account.
The `diff.*` elements inform the simulator if there is a differential probability according to the state the host is currently in and the `hostCount.*` elements inform the simulator if the number of host in each state has to be taken into account.
All parameters must be provided, although `timeDep.*`, `diff.*` and `hostCount.*` have default values set to `FALSE`; if you do not want to use these options, then you do not have to explicitly provide a value.
## General parameters
`length.sim` and `max.infected` are general parameters that define the simulation:
- `length.sim` is the maximum number of time units (e.g. days, months, years, or another time unit of choice) during which the simulation will be run.
- `max.infected` is the maximum number of individuals that can be infected during the simulation.
`init.individuals` and `init.structure` are the “seeding parameters”:
- `init.individuals` defines the number of individuals (an integer above 1) that will start a transmission chain. Keep in mind that you will have as many transmission chains as initial individuals, which is equivalent as launching a number of independent nosoi simulations.
- `init.structure` specifies the original location (as a vector of two coordinates, x and y) in the continuous space (has to be the same starting location for all starting individuals). The location provided in `init.structure` should of course be present in the `structure.raster`.
Here, we will run a simulation starting with 1 individual, for a maximum of 1,000 infected individuals and a maximum time of 300 days.
```{r setupB, eval = FALSE}
SimulationSingle <- nosoiSim(type="single", popStructure="continuous",
length.sim=300, max.infected=1000, init.individuals=1, ...)
```
## Core functions
The core functions `pExit`, `nContact`, `pMove`, `sdMove` and `pTrans` each follow the [same principles to be set up](nosoi.html#setting-up-the-core-functions).
To accommodate for different scenarios, they can be constant, time-dependent (using the relative time since infection `t` for each individual or the "absolute" time `pres.time` of the simulation) or even individually parameterized, to include some stochasticity at the individual-host level.
In any case, the provided function, like all other core functions in `nosoi`, has to be expressed as a function of time `t`, even if time is not used to compute the probability.
In case the function uses individual-based parameters, they must be specified in a list of functions (called `param.pExit`, `param.nContact`, `param.pMove`, `param.sdMove` or `param.pTrans`) (see [Get started](nosoi.html#parameters)). If no individual-based parameters are used, then these lists are set to `NA`.
> Keep in mind that `pExit`, `pMove`, and `pTrans` have to return a probability (i.e. a value between 0 and 1) while `nContact` should return a natural number (positive integer or zero) and `sdMove` a positive number.
Several parameters, such as the time since infection, the "absolute" time of the simulation, the location (in the discrete states) and individual-based parameters can be combined within the same function.
> `nosoi` can be flexible in what it allows as parameters in your function, but a common general structure should be observed.
> The argument of the function should be (in that order):
>
> 1. time since infection `t` (compulsory);
> 2. "absolute" time `prestime` (optional);
> 3. current environmental value `current.env.value` (optional);
> 4. host count in state `host.count` (optional);
> 5. other individual-based parameter(s), provided in `param.function`.
>
> If one of the argument is not used (except `t`), then you do not have to provide it and can continue with the next argument.
### `pExit`, `param.pExit`, `timeDep.pExit`, `diff.pExit` and `hostCount.pExit`
- `pExit` is the first required fundamental parameter and provides a daily probability for a host to leave the simulation (either cured, died, etc.).
- `param.pExit` is the list of functions needed to individually parameterize `pExit` (see [Get started](nosoi.html#parameters)). The name of each function in the list has to match the name of the parameter it is sampling for `pExit`.
- `timeDep.pExit` allows for `pExit` to be dependent on the "absolute" time of the simulation, to account - for example - for seasonality or other external time-related covariates. By default, `timeDep.pExit` is set to `FALSE`.
- `diff.pExit` allows `pExit` to differ according to the current discrete state of the host. This can be useful, for example, if one state has a higher mortality rate (or better cures!) for the infection, in that case the probability to exit the simulation is higher. By default, `diff.pExit` is set to `FALSE`. Be careful, every state should give back a result for `pExit`.
- `hostCount.pExit` allows `pExit` to differ according to the number of hosts currently in a state. By default, `hostCount.pExit` is set to `FALSE`. To use `hostCount.pExit`, `diff.pExit` has to be set to `TRUE` too.
### `pMove`, `param.pMove`, `timeDep.pMove`, `diff.pMove` and `hostCount.pMove`
- `pMove` is the probability (per unit of time) for a host to move on the map. It should not be confused with the distance the host will travel linked to `sdMove` (see below).
- `param.pMove` is the list of functions needed to individually parameterize `pMove` (see [Get started](nosoi.html#parameters)). The name of each function in the list has to match the name of the parameter it is sampling for `pMove`.
- `timeDep.pMove` allows for `pMove` to be dependent of the "absolute" time of the simulation, to account, for example, for seasonality or other external time related covariates. By default, `timeDep.pMove` is set to `FALSE`.
- `diff.pMove` allows `pMove` to be different according to the current environmental value of the location where the host is located. By default, `diff.pMove` is set to `FALSE`.
- `hostCount.pMove` allows `pMove` to differ according to the number of hosts currently in a raster cell. By default, `hostCount.pMove` is set to `FALSE`. To use `hostCount.pMove`, `diff.pMove` has to be set to `TRUE` too.
### `sdMove`, `param.sdMove`, `timeDep.sdMove`, `diff.sdMove` and `hostCount.sdMove`
- `sdMove` represents the standard deviation of the Brownian motion movement when a host undergo a move (set by `pMove`).
- `param.sdMove` is the list of function need to individually parameterize `sdMove` (see [Get started](nosoi.html#parameters)). The name of each function in the list has to have the same name as the parameter it is sampling for `sdMove`.
- `timeDep.sdMove` allows for `sdMove` to be dependent of the "absolute" time of the simulation, to account, for example, for seasonality or other external time related covariates. By default, `timeDep.sdMove` is set to `FALSE`.
- `diff.sdMove` allows `sdMove` to be different according to the current environmental value of the location where the host is located. By default, `diff.sdMove` is set to `FALSE`.
- `hostCount.sdMove` allows `sdMove` to differ according to the number of hosts currently in a raster cell. By default, `hostCount.sdMove` is set to `FALSE`. To use `hostCount.sdMove`, `diff.sdMove` has to be set to `TRUE` too.
### `nContact`, `param.nContact`, `timeDep.nContact`, `diff.nContact` and `hostCount.nContact`
- `nContact` represents the number (expressed as a positive integer) of potentially infectious contacts an infected hosts can encounter per unit of time. At each time point, a number of contacts will be determined for each active host in the simulation.
The number of contacts (i.e. the output of your function) has to be an integer and can be set to zero.
- `param.nContact` is the list of functions needed to individually parameterize `nContact` (see [Get started](nosoi.html#parameters)). The name of each function in the list has to match the name of the parameter it is sampling for `nContact`.
- `timeDep.nContact` allows for `nContact` to be dependent on the "absolute" time of the simulation, to account - for example - for seasonality or other external time-related covariates. By default, `timeDep.nContact` is set to `FALSE`.
- `diff.nContact` allows for `nContact` to differ according to the current discrete state of the host. By default, `diff.nContact` is set to `FALSE`.
- `hostCount.nContact` allows `nContact` to differ according to the number of hosts currently in a raster cell. This can be useful to adjust the number of contact to the number of potentially susceptible hosts if the infected population is close to the maximum size of the population in a raster cell. By default, `hostCount.nContact` is set to `FALSE`. To use `hostCount.nContact`, `diff.nContact` has to be set to `TRUE` too.
### `pTrans`, `param.pTrans`, `timeDep.pTrans`,`diff.pTrans` and `hostCount.pTrans`
- `pTrans` is the heart of the transmission process and represents the probability of transmission over time (when a contact occurs).
- `param.pTrans` is the list of functions needed to individually parameterize `pTrans` (see [Get started](nosoi.html#parameters)). The name of each function in the list has to match the name of the parameter it is sampling for `pTrans`.
- `timeDep.pTrans` allows for `pTrans` to be dependent on the "absolute" time of the simulation, to account - for example - for seasonality or other external time-related covariates. By default, `timeDep.pTrans` is set to `FALSE`.
- `diff.pTrans` allows for `pTrans` to be different according to the current discrete state of the host. This can be used to account of different dynamics linked to external factors common within a state, such as temperature for example. By default, `diff.pTrans` is set to `FALSE`.
- `hostCount.pTrans` allows `pTrans` to differ according to the number of hosts currently in a raster cell. By default, `hostCount.pTrans` is set to `FALSE`. To use `hostCount.pTrans`, `diff.pTrans` has to be set to `TRUE` too.
## Miscellaneous
`prefix.host` allows you to define the first character(s) for the hosts' unique ID.
It will be followed by a hyphen and a unique number.
By default, `prefix.host` is "H" for "Host".
`print.progress` allows you to have some information printed on the screen about the simulation as it is running. It will print something every `print.step`. By default, `print.progress` is activated with a `print.step = 10` (you can change this frequency), but you may want to deactivate it by setting `print.progress=FALSE`.
## Dual host
In the case of a dual host simulation, several parameters of the `nosoiSim` will have to be specified for each host type, designated by `A` and `B`. The wrapper function `nosoiSim` will then take all the arguments that will be passed down to the simulator, in the case of this tutorial `dualContinuous` (for "dual host, continuous structure").
We thus start by providing the options `type="dual"` and `popStructure="continuous"` to set up the analysis:
```{r setupA-dual, eval = FALSE}
SimulationDual <- nosoiSim(type="dual", popStructure="continuous", ...)
```
This function takes several arguments or options to be able to run, namely:
- `length.sim`
- `max.infected.A`
- `max.infected.B`
- `init.individuals.A`
- `init.individuals.B`
- `init.structure.A`
- `init.structure.B`
- `structure.raster.A`
- `structure.raster.B`
- `pExit.A` with `param.pExit.A`, `timeDep.pExit.A`, `diff.pExit.A` and `hostCount.pExit.A`
- `pMove.A` with `param.pMove.A`, `timeDep.pMove.A`, `diff.pMove.A` and `hostCount.pMove.A`
- `sdMove.A` with `param.sdMove.A`, `timeDep.sdMove.A`, `diff.sdMove.A` and `hostCount.sdMove.A`
- `attracted.by.raster.A`
- `nContact.A` with `param.nContact.A`, `timeDep.nContact.A`, `diff.nContact.A` and `hostCount.nContact.A`
- `pTrans.A` with `param.pTrans.A`, `timeDep.pTrans.A`, `diff.pTrans.A` and `hostCount.pTrans.A`
- `prefix.host.A`
- `pExit.B` with `param.pExit.B`, `timeDep.pExit.B`, `diff.pExit.B` and `hostCount.pExit.B`
- `pMove.B` with `param.pMove.B`, `timeDep.pMove.B`, `diff.pMove.B` and `hostCount.pMove.B`
- `sdMove.B` with `param.sdMove.B`, `timeDep.sdMove.B`, `diff.sdMove.B` and `hostCount.sdMove.B`
- `attracted.by.raster.B`
- `nContact.B` with `param.nContact.B`, `timeDep.nContact.B`, `diff.nContact.B` and `hostCount.nContact.B`
- `pTrans.B` with `param.pTrans.B`, `timeDep.pTrans.B`, `diff.pTrans.B` and `hostCount.pTrans.B`
- `prefix.host.B`
- `print.progress`
- `print.step`
As you can see, host-type dependent parameters are now designated by the suffix `.A` or `.B`.
Both `max.infected.A` and `max.infected.B` have to be provided in order to set an upper limit on the simulation size. To initiate the simulation, you have to provide at least one starting host, either `A` or `B` in `init.individuals.A` or `init.individuals.B` respectively, as well as a starting position in `init.individuals.A` or `init.individuals.B`, respectively. If you want to start the simulation with one host only, then `init.individuals` of the other can be set to 0 and `init.structure` to `NA`.
A major difference here is that hosts may, or not, share the same `structure.raster`. However, since they exist in the same "world", they should be in the same coordinate system. It is also possible to have a host that does not move. In such case, `pMove` can be set to `NA`, as well as `sdMove`. In such "non-moving" case, a raster should still be provided, as it anchors the hosts into the simulation's world.
Here again, all parameters must be provided for both hosts, although `timeDep`, `diff` and `hostCount` have default values set to `FALSE`; if you do not want to use these options, then you do not have to explicitly provide a value. Be careful to switch `diff` to `TRUE` if you want to use `hostCount`, and remember to provide a result for each state.
# Running `nosoi`
## Single host
We present here a very simple simulation for a single host pathogen.
### pExit
For `pExit`, we choose a probability that only depends on the environmental value of the location where the host currently is. The maximum value of our environmental raster is close to 70, and the higher the environmental value is, the less likely the host is to die:
```{r pExit1, eval = FALSE}
p_Exit_fct <- function(t, current.env.value){
if(current.env.value > 60){p=0.02}
if(current.env.value < 60 && current.env.value > 30){p=0.04}
if(current.env.value < 30){p=0.08}
return(p)
}
```
Taken together, the value for `pExit` according to the environmental value will be the following:
```{r pExit2, echo=FALSE, message=FALSE}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
library(ggplot2)
library(dplyr)
p_Exit_fctx <- function(x){
if(x >= 60){p=0.02}
if(x < 60 && x > 30){p=0.04}
if(x <= 30){p=0.08}
return(p)
}
data3=data.frame(v2=seq(from = 0, to = 70, by = 0.2))
data3 = data3 %>% group_by(v2) %>% mutate(p=p_Exit_fctx(v2))
ggplot(data=data3,aes(x=v2,y=p)) + geom_line() + theme_minimal() + labs(x="Environmental value",y="pExit") + ylim(0,0.1)
}
```
Remember that `pExit`, like the other core functions has to be function of `t`, even if `t` is not used. Since `pExit` is dependent on the location, `diff.pMove=TRUE`. However, there is no use of the "absolute" time of the simulation nor individual-based parameters, so `param.pExit=NA`, and `timeDep.pExit=FALSE`.
### pMove
We choose a constant value for `pMove`, namely 0.1, i.e. an infected host has 10% chance to change its location for each unit of time.
```{r pMove1, eval = FALSE}
p_Move_fct <- function(t){return(0.1)}
```
Remember that `pMove`, like the other core functions has to be function of `t`, even if `t` is not used. Since `pMove` is not dependent on the location, `diff.pMove=FALSE`. Similarly, there is no use of the "absolute" time of the simulation nor individual-based parameters, so `param.pMove=NA`, and `timeDep.pMove=FALSE`.
### sdMove
We choose a constant value for `sdMove`, namely 0.25. `sdMove` is the standard deviation of the Brownian motion underlying the movement in `nosoi`.
```{r sdMove1, eval=FALSE}
sd_Move_fct <- function(t){return(0.25)}
```
Remember that `sdMove`, like the other core functions has to be function of `t`, even if `t` is not used. Since `sdMove` is not dependent on the location, `diff.pMove=FALSE`. Similarly, there is no use of the "absolute" time of the simulation nor individual-based parameters, so `param.sdMove=NA`, and `timeDep.sdMove=FALSE`.
### nContact
For `nContact`, we choose a constant function that will draw a value in a normal distribution of *mean* = 0.5 and *sd* = 1, round it, and take its absolute value.
```{r nContact1, eval = FALSE}
n_contact_fct <- function(t){abs(round(rnorm(1, 0.5, 1), 0))}
```
The distribution of `nContact` looks as follows:
```{r nContact2, echo=FALSE}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
library(ggplot2)
library(dplyr)
set.seed(900)
data = data.frame(N=abs(round(rnorm(200, 0.5, 1), 0)))
data = data %>% group_by(N) %>% summarise(freq=length(N)/200)
ggplot(data=data, aes(x=as.factor(N), y=freq)) + geom_bar(stat="identity") + theme_minimal() + labs(x="nContact",y="Frequency")
}
```
At each time and for each infected host, `nContact` will be drawn anew. Remember that `nContact`, like the other core functions has to be function of `t`, even if `t` is not used. Since `nContact` is constant here, there is no use of the "absolute" time of the simulation, the location of the host, nor individual-based parameters. So `param.nContact=NA`, `timeDep.nContact=FALSE` and `diff.nContact=FALSE`.
### pTrans
We choose `pTrans` in the form of a threshold function: before a certain amount of time since initial infection, the host does not transmit (incubation time, which we call `t_incub`), and after that time, it will transmit with a certain (constant) probability (which we call `p_max`). This function is dependent of the time since the host's infection `t`.
```{r pTrans1, eval = FALSE}
p_Trans_fct <- function(t, p_max, t_incub){
if(t < t_incub){p=0}
if(t >= t_incub){p=p_max}
return(p)
}
```
Because each host is different (slightly different biotic and abiotic factors), you can expect each host to exhibit differences in the dynamics of infection, and hence the probability of transmission over time. Thus, `t_incub` and `p_max` will be sampled for each host individually according to a certain distribution. `t_incub` will be sampled from a normal distribution of $mean$ = 7 and $sd$ = 1, while `p_max` will be sampled from a beta distribution with shape parameters $\alpha$ = 5 and $\beta$ = 2:
```{r pTrans2, eval = FALSE}
t_incub_fct <- function(x){rnorm(x,mean = 7,sd=1)}
p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)}
```
Note that here `t_incub` and `p_max` are functions of `x` and not `t` (they are not core functions but individual-based parameters), and `x` enters the function as the number of draws to make.
Taken together, the profile for `pTrans` for a subset of 200 individuals in the population will look as follows:
```{r pTrans3, echo=FALSE}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
library(ggplot2)
library(dplyr)
set.seed(506)
p_Trans_fct <- function(t, p_max, t_incub){
if(t < t_incub){p=0}
if(t >= t_incub){p=p_max}
return(p)
}
t_incub_fct <- function(x){rnorm(x,mean = 7,sd=1)}
p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)}
data = data.frame(t_incub=t_incub_fct(200),p_max=p_max_fct(200),host=paste0("H-",1:200))
t=c(0:12)
data3=NULL
for(t in 0:15){
data2 = data %>% group_by(host) %>% mutate(proba=p_Trans_fct(t=t,p_max=p_max, t_incub=t_incub))
data2$t = t
data3 = rbind(data3, data2)
}
ggplot(data=data3, aes(x=t, y=proba,group=host)) + geom_line(color="grey60") + theme_minimal() + labs(x="Time since infection (t)",y="pTrans")
}
```
`pTrans` is not dependent on the "absolute" time of the simulation, nor on the hosts location, so `timeDep.pTrans=FALSE` and `diff.pTrans=FALSE`. However, since we make use of individual-based parameters, we have to provide a `param.pTrans` as a list of functions. The name of each element within this list should have the same name that the core function (here `pTrans`) uses as argument, e.g.:
```{r pTrans4, eval = FALSE}
t_incub_fct <- function(x){rnorm(x,mean = 7,sd=1)}
p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)}
param_pTrans = list(p_max=p_max_fct,t_incub=t_incub_fct)
```
### Running
Once `nosoiSim` is set up, you can run the simulation (here the "seed" ensures that you will get the same results as in this tutorial).
```{r setupF}
library(nosoi)
#Raster is test.raster
#Starting position will be
start.pos <- c(0,0) # c(x,y)
#pExit
p_Exit_fct <- function(t, current.env.value){
if(current.env.value > 60){p=0.02}
if(current.env.value < 60 && current.env.value > 30){p=0.04}
if(current.env.value < 30){p=0.08}
return(p)
}
#pMove
p_Move_fct <- function(t){return(0.1)}
#sdMove
sd_Move_fct <- function(t){return(0.25)}
#nContact
n_contact_fct = function(t){abs(round(rnorm(1, 0.5, 1), 0))}
#pTrans
proba <- function(t,p_max,t_incub){
if(t <= t_incub){p=0}
if(t >= t_incub){p=p_max}
return(p)
}
t_incub_fct <- function(x){rnorm(x,mean = 5,sd=1)}
p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)}
param_pTrans = list(p_max=p_max_fct,t_incub=t_incub_fct)
# Starting the simulation ------------------------------------
set.seed(846)
SimulationSingle <- nosoiSim(type="single", popStructure="continuous",
length.sim=300, max.infected=300, init.individuals=1,
init.structure=start.pos,
structure.raster=test.raster,
pExit = p_Exit_fct,
param.pExit = NA,
timeDep.pExit=FALSE,
diff.pExit=TRUE,
pMove = p_Move_fct,
param.pMove = NA,
timeDep.pMove=FALSE,
diff.pMove=FALSE,
sdMove = sd_Move_fct,
param.sdMove = NA,
timeDep.sdMove=FALSE,
diff.sdMove=FALSE,
attracted.by.raster=FALSE,
nContact=n_contact_fct,
param.nContact=NA,
timeDep.nContact=FALSE,
diff.nContact=FALSE,
pTrans = proba,
param.pTrans = list(p_max=p_max_fct,t_incub=t_incub_fct),
timeDep.pTrans=FALSE,
diff.pTrans=FALSE,
prefix.host="H",
print.progress=FALSE,
print.step=10)
```
Once the simulation has finished, it reports the number of time units for which the simulation has run (`r SimulationSingle$total.time`), and the maximum number of infected hosts (`r SimulationSingle$host.info.A$N.infected`).
Note that the simulation has stopped here before reaching `length.sim` as it has crossed the `max.infected` threshold set at 300.
## Dual host
Setting up a dual host simulation is similar to the single host version described above, but each parameter has to be provided for both hosts. Here, we choose for Host A the same parameters as the single / only host above. Host B will have sightly different parameters:
### pExit.B
For `pExit.B`, we choose a value that depends on the "absolute" time of the simulation, for example cyclic climatic conditions (temperature). In that case, the function's arguments should be `t` and `prestime` (the "absolute" time of the simulation), in that order:
```{r pExit1-dual, eval = FALSE}
p_Exit_fctB <- function(t,prestime){(sin(prestime/(2*pi*10))+1)/16} #for a periodic function
```
The values of `pExit.B` across the "absolute time" of the simulation will be the following:
```{r pExit2-dual, echo = FALSE}
p_Exit_fctx <- function(x){(sin(x/(2*pi*10))+1)/16} #for a periodic function
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
ggplot(data=data.frame(x=0), aes(x=x)) + stat_function(fun=p_Exit_fctx) + theme_minimal() + labs(x="Absolute time (prestime)",y="pExit") + xlim(0,360)
}
```
Since `pExit.B` is dependent on the simulation's time, do not forget to set `timeDep.pExit.B` to `TRUE`. Since there are no individual-based parameters nor is there any influence of the host's location, `param.pExit.B=NA` and `diff.pExit.B=NA`.
### pMove.B
We assume here that the hosts B do not move. `pMove.B` will hence be set to `NA`.
```{r pMove.B, eval = FALSE}
p_Move_fct.B <- NA
```
Since `pMove.B` is not dependent on the location, `diff.pMove.B=FALSE`. Similarly, there is no use of the "absolute" time of the simulation nor individual-based parameters, so `param.pMove.B=NA`, and `timeDep.pMove.B=FALSE`.
### sdMove.B
Since hosts B do not move, `sdMove.B` is irrelevant and will be set to `NA`.
```{r sdMove.B, eval = FALSE}
sd_Move_fct.B <- NA
```
Since `sdMove.B` is not dependent on the location, `diff.sdMove.B=FALSE`. Similarly, there is no use of the "absolute" time of the simulation nor individual-based parameters, so `param.sdMove.B=NA`, and `timeDep.sdMove.B=FALSE`.
### nContact.B
For `nContact.B`, we choose a constant function that will sample a value from a provided list of values, with a certain probability:
```{r nContact1.B, eval = FALSE}
n_contact_fct.B = function(t){sample(c(0,1,2),1,prob=c(0.6,0.3,0.1))}
```
The distribution of `nContact.B` looks as follows:
```{r nContact2.B, echo=FALSE}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
library(ggplot2)
library(dplyr)
set.seed(6059)
data = data.frame(N=sample(c(0,1,2),200,replace=TRUE,prob=c(0.6,0.3,0.1)))
data = data %>% group_by(N) %>% summarise(freq=length(N)/200)
ggplot(data=data, aes(x=as.factor(N), y=freq)) + geom_bar(stat="identity") + theme_minimal() + labs(x="nContact.B",y="Frequency")
}
```
At each time and for each infected host, `nContact.B` will be drawn anew. Remember that `nContact.B`, like the other core functions has to be function of `t`, even if `t` is not used. Since `nContact.B` is constant here, there is no use of the "absolute" time of the simulation, the host's location, nor individual-based parameters. So `param.nContact.B=NA`, `timeDep.nContact.B=FALSE` and `diff.nContact.B=FALSE`.
### pTrans.B
We choose `pTrans.B` in the form of a Gaussian function. It will reach its maximum value at a certain time point (mean) after initial infection and will subsequently decrease until it reaches 0.
The function `dnorm` used here will achieve this objective: for each time `t` after infection, its return value will reach its maximum value (< 1) at its mean (here `max.time`, a parameter that will be set for each individual, see below) and then decline back to 0. Its increase and decline is parameterized by the standard deviation `sd` of the `dnorm` function.
```{r pTrans1.B, eval = FALSE}
p_Trans_fct.B <- function(t, max.time){
dnorm(t, mean=max.time, sd=2)*5
}
```
Because each host is different (slightly different biotic and abiotic factors), you can expect each host to exhibit differences in the dynamics of infection, and hence the probability of transmission over time. Thus, `max.time` will be sampled for each host individually according to a certain distribution. `max.time` will be sampled from a normal distribution of parameters $mean$ = 5 and $sd$ = 1:
```{r pTrans2.B, eval = FALSE}
max.time_fct <- function(x){rnorm(x,mean = 5,sd=1)}
```
Note again that here `max.time` is a function of `x` and not `t` (i.e. not a core function but individual-based parameters), and `x` enters the function as the number of draws to make.
Taken together, the profile for `pTrans` for a subset of 200 individuals in the population will look as follow:
```{r pTrans3.B, echo=FALSE}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
message("Package 'ggplot2' is needed for plotting this figure.")
} else {
library(ggplot2)
library(dplyr)
set.seed(6609)
p_Trans_fct <- function(t, max.time){
dnorm(t, mean=max.time, sd=2)*5
}
max.time_fct <- function(x){rnorm(x,mean = 5,sd=1)}
data = data.frame(max.time=max.time_fct(200),host=paste0("H-",1:200))
t=c(0:12)
data3=NULL
for(t in 0:15){
data2 = data %>% group_by(host) %>% mutate(proba=p_Trans_fct(t=t,max.time=max.time))
data2$t = t
data3 = rbind(data3, data2)
}
ggplot(data=data3, aes(x=t, y=proba,group=host)) + geom_line(color="grey60",alpha=0.3) + theme_minimal() + labs(x="Time since infection (t)",y="pTrans")
}
```
Since `pTrans.B` is not dependent of the "absolute" time of the simulation nor the host's location, `timeDep.pTrans.B=FALSE` and `diff.pTrans.B=FALSE`. However, since we have made use of individual-based parameters, we have to provide a `param.pTrans` as a list of functions. The names of each element of the list should have the same name that the core function (`pTrans.B` here) uses as its argument, as shown here:
```{r pTrans4.B, eval = FALSE}
max.time_fct <- function(x){rnorm(x,mean = 5,sd=1)}
param_pTrans.B = list(max.time=max.time_fct)
```
### Running
Once nosoiSim is set up, you can run the simulation (here the "seed" ensures that you will get the same results as in this tutorial).
```{r setupF.B}
library(nosoi)
#Raster is test.raster
#Starting position will be
start.pos <- c(0,0) # c(x,y)
#Host A -----------------------------------
#pExit
p_Exit_fct <- function(t, current.env.value){
if(current.env.value > 60){p=0.02}
if(current.env.value < 60 && current.env.value > 30){p=0.04}
if(current.env.value < 30){p=0.08}
return(p)
}
#pMove
p_Move_fct <- function(t){return(0.1)}
#sdMove
sd_Move_fct <- function(t){return(0.25)}
#nContact
n_contact_fct = function(t){abs(round(rnorm(1, 0.5, 1), 0))}
#pTrans
proba <- function(t,p_max,t_incub){
if(t <= t_incub){p=0}
if(t >= t_incub){p=p_max}
return(p)
}
t_incub_fct <- function(x){rnorm(x,mean = 5,sd=1)}
p_max_fct <- function(x){rbeta(x,shape1 = 5,shape2=2)}
param_pTrans = list(p_max=p_max_fct,t_incub=t_incub_fct)
#Host B -----------------------------------
#pExit
p_Exit_fct.B <- function(t,prestime){(sin(prestime/(2*pi*10))+1)/16}
#pMove
p_Move_fct.B <- NA
#pMove
sd_Move_fct.B <- NA
#nContact
n_contact_fct.B = function(t){sample(c(0,1,2),1,prob=c(0.6,0.3,0.1))}
#pTrans
p_Trans_fct.B <- function(t, max.time){
dnorm(t, mean=max.time, sd=2)*5
}
max.time_fct <- function(x){rnorm(x,mean = 5,sd=1)}
param_pTrans.B = list(max.time=max.time_fct)
# Starting the simulation ------------------------------------
set.seed(60)
SimulationDual <- nosoiSim(type="dual", popStructure="continuous",
length.sim=300,
max.infected.A=100,
max.infected.B=200,
init.individuals.A=1,
init.individuals.B=0,
init.structure.A=start.pos,
init.structure.B=NA,
structure.raster.A=test.raster,
structure.raster.B=test.raster,
pExit.A = p_Exit_fct,
param.pExit.A = NA,
timeDep.pExit.A=FALSE,
diff.pExit.A=TRUE,
pMove.A = p_Move_fct,
param.pMove.A = NA,
timeDep.pMove.A=FALSE,
diff.pMove.A=FALSE,
sdMove.A = sd_Move_fct,
param.sdMove.A = NA,
timeDep.sdMove.A=FALSE,
diff.sdMove.A=FALSE,
attracted.by.raster.A=FALSE,
nContact.A=n_contact_fct,
param.nContact.A=NA,
timeDep.nContact.A=FALSE,
diff.nContact.A=FALSE,
pTrans.A = proba,
param.pTrans.A = list(p_max=p_max_fct,t_incub=t_incub_fct),
timeDep.pTrans.A=FALSE,
diff.pTrans.A=FALSE,
prefix.host.A="H",
pExit.B = p_Exit_fct.B,
param.pExit.B = NA,
timeDep.pExit.B=TRUE,
diff.pExit.B=FALSE,
pMove.B = p_Move_fct.B,
param.pMove.B = NA,
timeDep.pMove.B=FALSE,
diff.pMove.B=FALSE,
sdMove.B = sd_Move_fct.B,
param.sdMove.B = NA,
timeDep.sdMove.B=FALSE,
diff.sdMove.B=FALSE,
attracted.by.raster.B=FALSE,
nContact.B=n_contact_fct.B,
param.nContact.B=NA,
timeDep.nContact.B=FALSE,
diff.nContact.B=FALSE,
pTrans.B = p_Trans_fct.B,
param.pTrans.B = param_pTrans.B,
timeDep.pTrans.B=FALSE,
diff.pTrans.B=FALSE,
prefix.host.B="V",
print.progress=FALSE)
```
Once the simulation has finished, it reports the number of time units for which the simulation has run (`r SimulationDual$total.time`), and the maximum number of infected hosts A (`r SimulationDual$host.info.A$N.infected`) and hosts B (`r SimulationDual$host.info.B$N.infected`).
Note that the simulation has stopped here before reaching `length.sim` as it has crossed the `max.infected.A` threshold set at 100.
# Going further
To analyze and visualize your `nosoi` simulation output, you can have a look on [this page](https://slequime.github.io/nosoi/articles/examples/viz.html).
A practical example using a dual host type of simulation with population structure in a continuous space is also available:
- [Spread of Ebola virus in a continuous space](https://slequime.github.io/nosoi/articles/examples/ebola.html).