-
Notifications
You must be signed in to change notification settings - Fork 1
/
IBMPopSim_human_pop_IMD.Rmd
633 lines (448 loc) · 20.6 KB
/
IBMPopSim_human_pop_IMD.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
---
title: "Human population with swap"
author: "Daphné Giorgi, Sarah Kaakai, Vincent Lemaire"
pkgdown:
as_is: true
output:
bookdown::pdf_document2:
toc: yes
toc_depth: 3
number_sections: yes
package: IBMPopSim
vignette: >
%\VignetteIndexEntry{Human population with swap}
%\VignetteEncoding{UTF-8}
\usepackage[utf8]{inputenc}
%\VignetteEngine{knitr::rmarkdown}
---
This vignette provides an example of usage of the package `IBMPopSim`, for simulating an heterogeneous human population in which individuals can change of characteristics over their life course. Since this is an example of advanced usage of `IBMPopSim`, it is recommended to start by reading the package vignettes `vignette('IBMPopSim')` and `vignette('IBMPopSim_human_pop')`.
The population is based on data from England and Wales (EW) population, and individuals are characterized by the so-called Index of Multiple Deprivation (IMD), a deprivation index based on the place of living ( see [here](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/464430/English_Index_of_Multiple_Deprivation_2015_-_Guidance.pdf) for more details).
A toy model is simulated, where the individuals' demographic rates depend on their IMD. Due to internal migrations, individuals can change of IMD over their life course (swap events).
# Example description
The population is structured by age, gender and IMD. The population is distributed into five IMD quintiles. Group 1 corresponds to the least deprived subpopulation, and group 5 to the most deprived subpopulation.
Death and birth intensities are constant over time, but depend on the individuals age, gender and IMD. For instance, individuals in group 5 have a higher death intensity than individuals in group 1.
Individuals inherit the same IMD than their parent at birth, but can change of IMD over time, due to internal migration. In this toy model, we assume than younger individuals around 20 are more likely to move to a more deprived neighborhood, while individuals in the age class 30-45 are more likely to move to a less deprived neighborhood.
# Population creation
The initial population is a 100 000 individuals population sampled from England and Wales' 2014 age pyramid, structured by single year of age, gender and IMD quintile (source: Office for National Statistics).
```r
str(EW_popIMD_14)
## List of 3
## $ age_pyramid:'data.frame': 1160 obs. of 4 variables:
## ..$ age : Factor w/ 116 levels "0 - 1","1 - 2",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ IMD : int [1:1160] 1 2 3 4 5 1 2 3 4 5 ...
## ..$ male : logi [1:1160] FALSE FALSE FALSE FALSE FALSE TRUE ...
## ..$ value: num [1:1160] 49114 54293 61541 73289 85626 ...
## $ death_rates:List of 2
## ..$ male :'data.frame': 455 obs. of 3 variables:
## .. ..$ age : int [1:455] 0 1 2 3 4 5 6 7 8 9 ...
## .. ..$ IMD : num [1:455] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ value: num [1:455] 2.59e-03 3.13e-04 1.02e-04 1.65e-05 1.46e-04 ...
## ..$ female:'data.frame': 455 obs. of 3 variables:
## .. ..$ age : int [1:455] 0 1 2 3 4 5 6 7 8 9 ...
## .. ..$ IMD : num [1:455] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ value: num [1:455] 2.87e-03 2.52e-04 1.43e-04 8.73e-05 1.20e-04 ...
## $ sample :'data.frame': 100000 obs. of 4 variables:
## ..$ birth: num [1:100000] -0.0917 -0.6313 -0.5008 -0.4568 -0.6978 ...
## ..$ death: num [1:100000] NA NA NA NA NA NA NA NA NA NA ...
## ..$ male : logi [1:100000] FALSE TRUE TRUE FALSE TRUE FALSE ...
## ..$ IMD : int [1:100000] 5 2 5 4 3 4 2 1 5 4 ...
```
When creating a `?population`, internal functions proceed to several checks, which include verifications on appropriate column names and types.
```r
pop_init <- population(EW_popIMD_14$sample)
```
```r
head(pop_init)
## birth death male IMD
## 1 -0.09170591 NA FALSE 5
## 2 -0.63126956 NA TRUE 2
## 3 -0.50078693 NA TRUE 5
## 4 -0.45684987 NA FALSE 4
## 5 -0.69782311 NA TRUE 3
## 6 -0.04470658 NA FALSE 4
```
# Events description
There are 3 possible types of events :
- Birth
- Death
- Swap (internal migrations)
Each event is characterized by its intensity and its kernel, as follows.
## Death events
For each gender $\epsilon$ and IMD subgroup $i=1..5$, we define a step function
$$d_i^\epsilon (a), \quad a=0,..a_{max}$$
defining the death intensity of an individual of age $a$, gender $\epsilon$ and subgroup $i$.
### Parameters
Death intensities are based on England and Wales' 2014 age specific death rates by gender and IMD (source: Office for National Statistics).
Death rates are lower in less deprived quintiles.
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_death_rate-1.png")
```
</div>
**Step functions creation**
The death intensity functions for each IMD and gender are defined as parameters of the model, as two lists of R step functions.
These parameters will be transformed during the model creation into a vector of C++ step functions (starting from index 0).
```r
params_death = with(EW_popIMD_14$death_rates,
list(
"death_male"=lapply(1:5, function(i) stepfun(x= subset(male, IMD==i)[,"age"],
y= c(0,subset(male, IMD==i)[,"value"]))),
"death_female"=lapply(1:5, function(i) stepfun(x= subset(female, IMD==i)[,"age"],
y= c(0,subset(female, IMD==i)[,"value"])))
)
)
```
C++ code chunk implementing the intensity of a death event:
```r
intensity_code_death <- '
if (I.male)
result = death_male[(I.IMD-1)](age(I,t));
else
result = death_female[(I.IMD-1)](age(I,t));
'
```
**Creation of the event**
By default, the name of an event of type "death" is "death".
```r
death_event <- mk_event_individual(
type = "death",
intensity_code = intensity_code_death
)
```
## Birth events
In this toy model, only women give birth, at a Weibull shaped intensity $b_i(a)$ depending on their age $a$ and IMD subgroup $i$,
$$b_i(a)=TFR_i\frac{\beta_i}{\alpha_i}(\frac{(a-\bar{a})}{\alpha_i})^{\beta_i-1}\exp((-\frac{a-\bar{a}}{\alpha_i})^{\beta_i}).$$
These functions can be implemented by using the `IBMPopSim` function `weibull(k,c)`, which creates an R function corresponding the Weibull density function with parameters $(k,c)$, and which is translated into a C++ function during the model creation.
By default, the newborn inherits the IMD of his parent.
### Parameters
We consider that women in the IMD subgroup 1 and 2 (resp. 4 and 5) have the same birth intensity functions. The parameters for birth events are thus composed of 10 parameters:
- $\bar{a}$
- $\alpha=(\alpha_1,\alpha_2,\alpha_3)$
- $\beta=(\beta_1,\beta_2,\beta_3)$
- $TFR = (TFR_1,TFR_2, TFR_3)$.
Examples of parameters values are available in `toy_params$birth`.
```r
# birth_sex_ratio = 1.05
params_birth <- with(toy_params$birth,
list(
"TFR_weights" = TFR_weights,
"a_mean"= 15,
"birth" = list(
weibull(beta[1], alpha[1]), # Weibull functions creation
weibull(beta[2], alpha[2]),
weibull(beta[3], alpha[3])),
"p_male" = 0.51 # probability to give birth to a male
)
)
```
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_birth_intensities-1.png")
```
</div>
### Event creation
We give some C++ code implementing the intensity of a birth event. The lists `params_birth$birth` and `params_birth$TFR_weights` are internally transformed into C++ vectors (index starting at 0).
```r
birth_intensity_code <- '
if (I.male) result = 0.;
else {
if (I.IMD <= 2) result = TFR_weights[0] * birth[0](age(I,t)-a_mean);
if (I.IMD == 3) result = TFR_weights[1] * birth[1](age(I,t)-a_mean);
if (I.IMD >= 4) result = TFR_weights[2] * birth[2](age(I,t)-a_mean);
}
'
```
Birth event creation
```r
birth_event <- mk_event_individual(
type = "birth",
intensity_code = birth_intensity_code,
kernel_code = "newI.male = CUnif(0, 1) < p_male;" # Choice of gender for newborn
)
```
## Swap event
Individuals can move during their lifetime, and thus change of IMD subgroup (swap events).
We assume that an individual can change of IMD subgroup with an intensity depending on his age and IMD subgroup.
Young individuals in the age class $[15,30]$ and in the less deprived IMD quintiles (1 and 2) can move to more deprived areas, for instance for studying. On the other hand, older individuals in the age class $[30,45]$ and $[0,15]$, and in deprived areas can move to a less deprived area, modeling for instance family creation, moving out to less deprived areas.
Age-specific swap intensities are given by 5 step functions, one for each IMD subgroup. When a swap event occur, the new IMD subgroup of the individual is determined by a discrete random variable depending on his age.
### Parameters
Example parameters values are given in `toy_params$swap`.
Swap intensity functions are saved as model parameters as a list of step functions, which will be transformed into a C++ vector of functions when the model is created.
The vectors of discrete probabilities determining the action of the event when a swap occur is saved in a matrix, which will be transformed into a Rcpp Armadillo matrix.
Note that data frames are not accepted as model parameters.
```r
params_swap <- with(toy_params$swap,
list(
"swap_intensities" = apply(intensities, 2, function(rates) stepfun(x=ages,
y=rates)),
"swap_distribution" = as.matrix(distribution),
"swap_age_to_idx" = stepfun(ages, seq(0,3))
)
)
```
```r
df_intensities <- data.frame("age"=c(0,15,30,45),toy_params$swap$intensities)
```
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_df-1.png")
```
</div>
C++ code of the intensity of a swap event, using the model parameters:
```r
intensity_code_swap <- '
result = swap_intensities[I.IMD-1](age(I,t));
'
```
C++ **kernel code** of a swap event
If a swap event occurs, the new IMD subgroup of the individual is determined by a discrete random variable, with probability distribution given by a row of `params_swap$swap_distribution`, depending on his age.
For instance, an individual of age in $[0,15]$ has a probability of 0.6 to move on IMD subgroup 1 and 0.4 to move to IMD subgroup 2.
Discrete random variables can be drawn in `IBMPopSim` by calling the function `Cdiscrete` in the kernel code, with an `Armadillo` vector or matrix (see Section 3 of `vignette('IBMPopSim')`).
```r
params_swap$swap_distribution
## IMD1 IMD2 IMD3 IMD4 IMD5
## [1,] 0.6 0.4 0.0 0.0 0.0
## [2,] 0.0 0.0 0.4 0.3 0.3
## [3,] 0.6 0.4 0.0 0.0 0.0
```
```r
kernel_code_swap <- '
int idx = swap_age_to_idx(age(I,t)); // variables must by typed in C++
I.IMD = CDiscrete(swap_distribution.begin_row(idx),
swap_distribution.end_row(idx)) + 1;
'
```
Swap event creation:
```r
swap_event <- mk_event_individual(
type = "swap",
intensity_code = intensity_code_swap,
kernel_code = kernel_code_swap
)
```
# Model creation (without id)
The model is created by calling the function `?mk_model` with arguments:
- The individuals' characteristics.
- The events list.
- The vector of model parameters (only parameters name and types are saved).
```r
params = c(params_birth, params_death, params_swap)
```
```r
model <- mk_model(
characteristics = get_characteristics(pop_init),
events = list(birth_event, death_event, swap_event),
parameters = params)
## Warning in compatibility_chars_events(characteristics, events): The list of events contains a 'swap' event and there is no 'id' in the characteristics.
## Add 'id' to the characteristics if tracking changes along time is desired.
summary(model)
## Events description:
## [[1]]
## Event class : individual
## Event type : birth
## Event name : birth
## Intensity code : '
## if (I.male) result = 0.;
## else {
## if (I.IMD <= 2) result = TFR_weights[0] * birth[0](age(I,t)-a_mean);
## if (I.IMD == 3) result = TFR_weights[1] * birth[1](age(I,t)-a_mean);
## if (I.IMD >= 4) result = TFR_weights[2] * birth[2](age(I,t)-a_mean);
## }
## '
## Kernel code : 'newI.male = CUnif(0, 1) < p_male;'
## [[2]]
## Event class : individual
## Event type : death
## Event name : death
## Intensity code : '
## if (I.male)
## result = death_male[(I.IMD-1)](age(I,t));
## else
## result = death_female[(I.IMD-1)](age(I,t));
## '
## Kernel code : ''
## [[3]]
## Event class : individual
## Event type : swap
## Event name : swap
## Intensity code : '
## result = swap_intensities[I.IMD-1](age(I,t));
## '
## Kernel code : '
## int idx = swap_age_to_idx(age(I,t)); // variables must by typed in C++
## I.IMD = CDiscrete(swap_distribution.begin_row(idx),
## swap_distribution.end_row(idx)) + 1;
## '
##
## ---------------------------------------
## Individual description:
## names: birth death male IMD
## R types: double double logical integer
## C types: double double bool int
## ---------------------------------------
## R parameters available in C++ code:
## names: TFR_weights a_mean birth p_male death_male death_female swap_intensities swap_distribution swap_age_to_idx
## R types: vector double list double list list list matrix closure
## C types: arma::vec double list_of_function_x double list_of_function_x list_of_function_x list_of_function_x arma::mat function_x
```
# Simulation
## Events bounds
The first step before simulating the model is to compute bounds for each event intensity.
**Birth intensity bound**
```r
E <-c(0,50)
birth_max <- with(params,
max(sapply(1:3, function(i)(
TFR_weights[i]* optimize(f=birth[[i]], interval=E,
maximum=TRUE)$objective)))
)
```
**Death intensity bound**
The operator `?max` has been overloaded in `IBMPopSim` and can be applied to step functions.
```r
death_max <- max(sapply(params$death_male, max),
sapply(params$death_female, max))
```
**Swap intensity bound**
```r
swap_max <- max(sapply(params$swap_intensities, max))
```
## Simulation
Recording all swap events is computationally expensive. To overcome this difficulty, the simulation in the presence of swap events returns a list of populations, composed of a "picture" of the population at all times in the vector argument `time` of `popsim` (see `vignette('IBMPopSim')` for more details).
The first component of the variable `t_sim` below is the initial time. The simulation returns `n-1` populations representing the population at $t\_sim[1],..,t\_sim[n]$. The population corresponding to time $t\_sim[i]$ is composed of all individuals who lived in the population before $t\_sim[i]$, with their characteristics at time $t\_sim[i]$.
The `events_bounds` names must correspond to the events names.
```r
t_sim <- 0:10
n <- length(t_sim)
```
```r
sim_out <- popsim(model,
initial_population = pop_init,
events_bounds = c('birth' = birth_max, 'swap' = swap_max, 'death' = death_max),
parameters = params,
time = t_sim,
multithreading=TRUE # Set to T since there are no interactions
)
```
# Outputs
Swap events modify the population composition, by increasing the proportion of individuals in the less deprived subgroups, especially for younger age groups.
## Initial and final age pyramids with different IMD subgroups
```r
pyr_init<- age_pyramids(population(EW_popIMD_14$sample),time = 0, ages = c(0:100,130))
pyr_init$group_name <- with(pyr_init, ifelse(male, paste('Males - IMD', IMD),
paste('Females - IMD', IMD)))
```
Final age pyramid
```r
pops_out <- sim_out$population
```
```r
pyr_IMD <- age_pyramid(pops_out[[50]], time = 50,ages = c(0:100,130))
## Error in pops_out[[50]]: subscript out of bounds
pyr_IMD$group_name <- with(pyr_IMD, ifelse(male, paste('Males - IMD', IMD),
paste('Females - IMD', IMD)))
## Error in eval(expr, envir, enclos): object 'pyr_IMD' not found
```
```r
colors <- c(sequential_hcl(n=5, palette = "Magenta"),
sequential_hcl(n=5, palette = "Teal"))
names(colors) <- c(paste('Females - IMD', 1:5),
paste('Males - IMD', 1:5))
```
```r
plot(pyr_init,colors)
```
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_pyramids-1.png")
```
</div>
```r
plot(pyr_IMD,colors)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'plot': object 'pyr_IMD' not found
```
<!--```{remark}-->
The high number of individuals aged over 100 is due to the fact that mortality rates are assumed to be constant at ages over 90, which is of course not realistic for human populations (see `vignette('IBMPopSim_human_pop')` for another choice of mortality rates).
<!--```-->
## Evolution of population composition by age group
The plots below illustrates the evolution of the female population composition for different age groups over time.
```r
age_grp <- seq(30,95,15)
age_pyrs <- lapply(1:(n-1), function (i)(age_pyramid(pops_out[[i]],t_sim[i+1],age_grp)))
age_pyr_fem <- filter(bind_rows(age_pyrs,.id="time"), male==FALSE)
compo_pop_fem <- age_pyr_fem %>%
group_by(age,time) %>%
mutate(composition = value/sum(value))
```
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_compo_fem-1.png")
```
</div>
# Different simulations with the same model
The model parameters and initial population can be modified without having to recompile the model.
In particular, events can be deactivated by setting the event bound to 0.
## Simulation without swap events
```r
sim_out_noswp <- popsim(model,
initial_population = pop_init,
events_bounds = c('birth' = birth_max, 'swap' = 0, 'death' = death_max),
# Swap events deactivated
parameters = params,
time = t_sim,
multithreading=TRUE)
## [1] "event swap is deactivated"
```
**Comparison of population evolution with and without swap events**
```r
age_pyrs_nosw <- lapply(1:(n-1), function (i)(age_pyramid(sim_out_noswp$population[[i]],
t_sim[i+1],age_grp)))
compo_pop_fem_nosw <- filter(bind_rows(age_pyrs_nosw,.id="time"), male==FALSE) %>%
group_by(age,time) %>% mutate(perc=value/sum(value))
```
<div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_compo_fem2-1.png")
```
</div><div class="figure" style="text-align: center">
```{r, echo=FALSE, fig.align='center', out.width = '70%'}
knitr::include_graphics("hpIMD_compo_fem2-2.png")
```
</div>
## Simulation of individual life courses
In the presence of swap events, individual trajectories can be isolated by attributing a unique id to each individuals in the population. This is done during the population creation by setting the optional argument `id` to `TRUE`.
```r
pop_init <- population(pop_init, id=TRUE)
```
```r
model_id <- mk_model(
characteristics = get_characteristics(pop_init),
events = list(birth_event, death_event, swap_event),
parameters = params
)
```
```r
sim_out_id <- popsim(model_id,
initial_population = pop_init,
events_bounds = c('birth' = birth_max, 'swap' = swap_max, 'death' = death_max),
parameters = params,
time = t_sim,
multithreading=TRUE)
```
When each individual has been attributed a unique id, a population summarizing each life course can be obtained from the simulation output `sim_out_id$population`(list of populations) by calling the function `merge_pop_withid`.
Characteristics to be tracked over time must be specified in the argument `chars_tracked`.
```r
pop_id <- merge_pop_withid(sim_out_id$population, chars_tracked = "IMD")
```
`pop_id` is a `population` in which each line corresponds to an individual and include his id, birth date, death date, gender and IMD subgroup at each discretization time (components of $t_{sim}$).
\footnotesize
```r
head(pop_id)
## id birth death male IMD_1 IMD_2 IMD_3 IMD_4 IMD_5 IMD_6 IMD_7 IMD_8 IMD_9 IMD_10
## 1 1 -0.09170591 NA FALSE 5 5 5 5 5 5 5 5 5 5
## 2 990 -0.70453761 NA TRUE 5 4 1 5 3 4 5 3 1 1
## 3 5028 -3.54821274 NA TRUE 4 2 2 5 3 2 4 3 4 5
## 4 28431 -23.26734155 NA FALSE 5 5 1 1 4 5 1 4 1 4
## 5 31998 -26.33175121 NA FALSE 3 5 4 5 4 5 3 3 5 1
## 6 30928 -25.01623313 NA FALSE 4 2 5 2 3 4 1 3 4 5
```