-
Notifications
You must be signed in to change notification settings - Fork 4
/
advanced.Rmd
704 lines (554 loc) · 34.1 KB
/
advanced.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
---
title: "rbmi: Advanced Functionality"
author: Alessandro Noci, Craig Gower-Page, and Marcel Wolbers
output:
bookdown::html_document2:
toc: true
toc_depth: 4
number_sections: true
citation_package: natbib
base_format: rmarkdown::html_vignette
bibliography: "references.bib"
link-citations: true
pkgdown:
as_is: true
linkcolor: blue
vignette: >
%\VignetteIndexEntry{rbmi: Advanced Functionality}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction
The purpose of this vignette is to provide an overview of some more advanced features of the `rbmi` package.
The sections of the vignette are relatively self-contained, i.e. readers should be able to jump directly to the section which covers the functionality that they are most interested in.
# Data simulation using function `simulate_data()` {#sec:dataSimul}
In order to demonstrate the advanced functions we will first create a simulated dataset with the `rbmi` function `simulate_data()`.
The `simulate_data()` function generates data from a randomized clinical trial with longitudinal continuous outcomes and up to two different types of intercurrent events (ICEs).
One intercurrent event (ICE1) may be thought of as a discontinuation from study treatment due to study drug or condition related (SDCR) reasons.
The other event (ICE2) may be thought of as discontinuation from study treatment due to not study drug or condition related (NSDCR) reasons.
For the purpose of this vignette, we simulate data similarly to the simulation study reported in @Wolbers2021 (though we change some of the simulation parameters) and include only one ICE type (ICE1).
Specifically, we simulate a 1:1 randomized trial of an active drug (intervention) versus placebo (control) with 100 subjects per group and 6 post-baseline assessments (bi-monthly visits until 12 months) under the following assumptions:
- The mean outcome trajectory in the placebo group increases linearly from 50 at baseline (visit 0) to 60 at visit 6, i.e. the slope is 10 points/year.
- The mean outcome trajectory in the intervention group is identical to the placebo group up to visit 2. From visit 2 onward, the slope decreases by 50% to 5 points/year.
- The covariance structure of the baseline and follow-up values in both groups is implied by a random intercept and slope model with a standard deviation of 5 for both the intercept and the slope, and a correlation of 0.25. In addition, an independent residual error with standard deviation 2.5 is added to each assessment.
- The probability of study drug discontinuation after each visit is calculated according to a logistic model which depends on the observed outcome at that visit. Specifically, a visit-wise discontinuation probability of 2% and 3% in the control and intervention group, respectively, is specified in case the observed outcome is equal to 50 (the mean value at baseline). The odds of a discontinuation is simulated to increase by +10% for each +1 point increase of the observed outcome.
- Study drug discontinuation is simulated to have no effect on the mean trajectory in the placebo group. In the intervention group, subjects who discontinue follow the slope of the mean trajectory from the placebo group from that time point onward. This is compatible with a copy increments in reference (CIR) assumption.
- Study drop-out at the study drug discontinuation visit occurs with a probability of 50% leading to missing outcome data from that time point onward.
The function `simulate_data()` requires 3 arguments (see the function documentation `help(simulate_data)` for more details):
- `pars_c`: The simulation parameters of the control group
- `pars_t`: The simulation parameters of the intervention group
- `post_ice1_traj`: Specifies how observed outcomes after ICE1 are simulated
Below, we report how data according to the specifications above can be simulated with function `simulate_data()`:
```{r message=FALSE}
library(rbmi)
library(dplyr)
library(ggplot2)
library(purrr)
set.seed(122)
n <- 100
time <- c(0, 2, 4, 6, 8, 10, 12)
# Mean trajectory control
muC <- c(50.0, 51.66667, 53.33333, 55.0, 56.66667, 58.33333, 60.0)
# Mean trajectory intervention
muT <- c(50.0, 51.66667, 53.33333, 54.16667, 55.0, 55.83333, 56.66667)
# Create Sigma
sd_error <- 2.5
covRE <- rbind(
c(25.0, 6.25),
c(6.25, 25.0)
)
Sigma <- cbind(1, time / 12) %*% covRE %*% rbind(1, time / 12) + diag(sd_error^2, nrow = length(time))
# Set probability of discontinuation
probDisc_C <- 0.02
probDisc_T <- 0.03
or_outcome <- 1.10 # +1 point increase => +10% odds of discontinuation
# Set drop-out rate following discontinuation
prob_dropout <- 0.5
# Set simulation parameters of the control group
parsC <- set_simul_pars(
mu = muC,
sigma = Sigma,
n = n,
prob_ice1 = probDisc_C,
or_outcome_ice1 = or_outcome,
prob_post_ice1_dropout = prob_dropout
)
# Set simulation parameters of the intervention group
parsT <- parsC
parsT$mu <- muT
parsT$prob_ice1 <- probDisc_T
# Set assumption about post-ice trajectory
post_ice_traj <- "CIR"
# Simulate data
data <- simulate_data(
pars_c = parsC,
pars_t = parsT,
post_ice1_traj = post_ice_traj
)
head(data)
# As a simple descriptive of the simulated data, summarize the number of subjects with ICEs and missing data
data %>%
group_by(id) %>%
summarise(
group = group[1],
any_ICE = (any(ind_ice1 == 1)),
any_NA = any(is.na(outcome))) %>%
group_by(group) %>%
summarise(
subjects_with_ICE = sum(any_ICE),
subjects_with_missings = sum(any_NA)
)
```
# Handling of observed post-ICE data in `rbmi` under reference-based imputation {#sec:postICEobs}
`rbmi` always uses all non-missing outcome data from the input data set, i.e. such data are never overwritten during the imputation step or removed from the analysis step. This implies that if there are data which are considered to be irrelevant for treatment effect estimation (e.g. data after an ICE for which the estimand specified a hypothetical strategy), then such data need to be removed from the input data set by the user prior to calling the `rbmi` functions.
For imputation under a missing at random (`MAR`) strategy, all observed outcome data is also included in the fitting of the base imputation model. However, for ICEs handled using reference-based imputation methods (such as `CIR`, `CR`, and `JR`), `rbmi` excludes observed post-ICE data from the base imputation model. If these data were not excluded, then the base imputation model would mistakenly estimate mean trajectories based on a mixture of observed pre- and post-ICE data which are not relevant for reference-based imputations. However, any observed post-ICE data are added back into the data set after the fitting of the base imputation model and included as is in the subsequent imputation and analysis steps.
Post-ICE data in the control or reference group are also excluded from the base imputation model if the user specifies a reference-based imputation strategy for such ICEs. This ensures that an ICE has the same impact on the data included in the base imputation model regardless whether the ICE occurred in the control or the intervention group. On the other hand, imputation in the reference group is based on a MAR assumption even for reference-based imputation methods and it may be preferable in some settings to include such post-ICE data from the control group in the base imputation model. This can be implemented by specifying a `MAR` strategy for the ICE in the control group and a reference-based strategy for the same ICE in the intervention group. We will use this latter approach in our example below.
The simulated trial data from section \@ref(sec:dataSimul) assumed that outcomes in the intervention group observed after the ICE "treatment discontinuation" follow the increments observed in the control group. Thus the imputation of missing data in the intervention group after treatment discontinuation might be performed under a reference-based copy increments in reference (`CIR`) assumption.
Specifically, we implement an estimator under the following assumptions:
- The endpoint of interest is the change in the outcome from baseline at each visit.
- The imputation model includes the treatment group, the (categorical) visit, treatment-by-visit interactions, the baseline outcome, and baseline outcome-by-visit interactions as covariates.
- The imputation model assumes a common unstructured covariance matrix in both treatment groups
- In the control group, all missing data are imputed under MAR whereas in the intervention group, missing post-ICE data are imputed under a CIR assumption
- The analysis model of the endpoint in the imputed datasets is a separate ANCOVA model for each visit with the treatment group as the primary covariate and adjustment for the baseline outcome value.
For illustration purposes, we chose MI based on approximate Bayesian posterior draws with 20 random imputations which is not very demanding from a computational perspective. In practical applications, the number of random imputations may need to be increased. Moreover, other imputations are also supported in `rbmi`. For guidance regarding the choice of the imputation approach, we refer the user to a comparison between all implemented approaches in Section 3.9 of the "Statistical Specifications" vignette (`vignette("stat_specs", package = "rbmi")`).
We first report the code to set the variables of the imputation and analysis models. If you are not yet familiar with the syntax, we recommend that you first check the "quickstart" vignette (`vignette("quickstart", package = "rbmi")`).
```{r}
# Create data_ice including the subject's first visit affected by the ICE and the imputation strategy
# Imputation strategy for post-ICE data is CIR in the intervention group and MAR for the control group
# (note that ICEs which are handled using MAR are optional and do not impact the analysis
# because imputation of missing data under MAR is the default)
data_ice_CIR <- data %>%
group_by(id) %>%
filter(ind_ice1 == 1) %>% # select visits with ICEs
mutate(strategy = ifelse(group == "Intervention", "CIR", "MAR")) %>%
summarise(
visit = visit[1], # Select first visit affected by the ICE
strategy = strategy[1]
)
# Compute endpoint of interest: change from baseline and
# remove rows corresponding to baseline visits
data <- data %>%
filter(visit != 0) %>%
mutate(
change = outcome - outcome_bl,
visit = factor(visit, levels = unique(visit))
)
# Define key variables for the imputation and analysis models
vars <- set_vars(
subjid = "id",
visit = "visit",
outcome = "change",
group = "group",
covariates = c("visit*outcome_bl", "visit*group"),
strategy = "strategy"
)
vars_an <- vars
vars_an$covariates <- "outcome_bl"
```
The chosen imputation method can be set with the function `method_approxbayes()` as follows:
```{r}
method <- method_approxbayes(n_sample = 20)
```
We can now sequentially call the 4 key functions of `rbmi` to perform the multiple imputation. Please note that the management of observed post-ICE data is performed without additional complexity for the user. `draws()` automatically excludes post-ICE data handled with a reference-based method (but keeps post-ICE data handled using MAR) using information provided by the argument `data_ice`. `impute()` will impute only truly missing data in `data[[vars$outcome]]`.
```{r}
draw_obj <- draws(
data = data,
data_ice = data_ice_CIR,
vars = vars,
method = method,
quiet = TRUE,
ncores = 2
)
impute_obj_CIR <- impute(
draw_obj,
references = c("Control" = "Control", "Intervention" = "Control")
)
ana_obj_CIR <- analyse(
impute_obj_CIR,
vars = vars_an
)
pool_obj_CIR <- pool(ana_obj_CIR)
pool_obj_CIR
```
This last output gives an estimated difference of
`r paste(formatC(pool_obj_CIR$pars$trt_6$est,format="f",digits=3)," (95% CI ",
formatC(pool_obj_CIR$pars$trt_6$ci[1],format="f",digits=3), " to ",
formatC(pool_obj_CIR$pars$trt_6$ci[2],format="f",digits=3),")",sep="")`
between the two groups at the last visit with an associated p-value `r ifelse(pool_obj_CIR$pars$trt_6$pval<0.001,"lower than 0.001", paste("of", formatC(pool_obj_CIR$pars$trt_6$pval,format="f",digits=3)))`.
# Efficiently changing reference-based imputation strategies
The `draws()` function is by far the most computationally intensive function in `rbmi`.
In some settings, it may be important to explore the impact of a change in the
reference-based imputation strategy on the results.
Such a change does not affect the imputation model but it does
affect the subsequent imputation step.
In order to allow changes in the imputation strategy without having to re-run the
`draws()` function, the function `impute()` has an additional argument `update_strategies`.
However, please note that this functionality comes with some important limitations:
As described at the beginning of Section \@ref(sec:postICEobs), post-ICE outcomes are included in the input dataset for the base imputation model if the imputation method is `MAR` but they are excluded for reference-based imputation methods (such as `CIR`, `CR`, and `JR`).
Therefore, `updata_strategies` cannot be applied if the imputation strategy is changed from a MAR to a non-MAR strategy in the presence of observed post-ICE outcomes. Similarly, a change from a non-MAR strategy to MAR triggers a warning in the presence of observed post-ICE outcomes because the base imputation model was not fitted to all relevant data under MAR.
Finally, `update_strategies` cannot be applied if the timing of any of the ICEs is changed (in argument `data_ice`) in addition to the imputation strategy.
As an example, we described an analysis under a copy increments in reference (CIR) assumption in the previous section. Let's assume we want to change this strategy to a jump to reference imputation strategy for a sensitivity analysis. This can be efficiently implemented using `update_strategies` as follows:
```{r}
# Change ICE strategy from CIR to JR
data_ice_JR <- data_ice_CIR %>%
mutate(strategy = ifelse(strategy == "CIR", "JR", strategy))
impute_obj_JR <- impute(
draw_obj,
references = c("Control" = "Control", "Intervention" = "Control"),
update_strategy = data_ice_JR
)
ana_obj_JR <- analyse(
impute_obj_JR,
vars = vars_an
)
pool_obj_JR <- pool(ana_obj_JR)
pool_obj_JR
```
For imputations under a jump to reference assumption, we get an estimated difference of
`r paste(formatC(pool_obj_JR$pars$trt_6$est,format="f",digits=3)," (95% CI ",
formatC(pool_obj_JR$pars$trt_6$ci[1],format="f",digits=3), " to ",
formatC(pool_obj_JR$pars$trt_6$ci[2],format="f",digits=3),")",sep="")`
between the two groups at the last visit with an associated p-value of
`r ifelse(pool_obj_JR$pars$trt_6$pval<0.001,"<0.001",formatC(pool_obj_JR$pars$trt_6$pval,format="f",digits=3))`.
# Imputation under MAR with time-varying covariates
@Guizzaro2021 suggested to implement a treatment policy strategy via imputation under a MAR assumption after conditioning on the subject's ICE status,
i.e. to impute missing post-ICE data based on observed post-ICE data. One possible implementation of this proposal is to add time-varying covariates to the imputation model.
A case study which implements this proposal and compares it to reference-based imputation methods for estimators in early Parkinson's disease can be found in @Noci2021.
In some settings, this may be carried out by including a binary time-varying indicator of the subject's ICE status at each visit (defined as 0 for pre-ICE visits and as 1 for post-ICE visits) to the imputation model. However, for the simulated data introduced in section \@ref(sec:dataSimul), it may be more plausible to assume that treatment discontinuation leads to a change in the "slope" of the mean outcome trajectory. This can be implemented by including a time-varying covariate which is equal to 0 for visits prior to the treatment discontinuation and equal to the time from the treatment discontinuation for subsequent visits. The regression coefficient of the corresponding change in the post-ICE "slope" should then be allowed to depend on the assigned treatment group, i.e. the imputation model should include an interaction between the time-varying covariate and the treatment group.
Let's first define the time-varying covariate:
```{r}
data <- data %>%
group_by(id) %>%
mutate(time_from_ice1 = cumsum(ind_ice1)*2/12 ) # multiplication by 2/12 because visits are bi-monthly
```
We can then include the time-varying covariate in the imputation model, crossed with the `group` variable:
```{r}
vars_tv <- set_vars(
subjid = "id",
visit = "visit",
outcome = "change",
group = "group",
covariates = c("visit*outcome_bl", "visit*group", "time_from_ice1*group"),
strategy = "strategy"
)
```
We now sequentially call the 4 key `rbmi` functions:
```{r}
draw_obj <- draws(
data = data,
data_ice = NULL, # if NULL, MAR is assumed for all missing data
vars = vars_tv,
method = method,
quiet = TRUE
)
impute_obj_tv <- impute(
draw_obj,
references = c("Control" = "Control", "Intervention" = "Intervention")
)
ana_obj_tv <- analyse(
impute_obj_tv,
vars = vars_an
)
pool(ana_obj_tv)
```
# Custom imputation strategies
The following imputation strategies are implemented in `rbmi`:
- Missing at Random (`MAR`)
- Jump to Reference (`JR`)
- Copy Reference (`CR`)
- Copy Increments in Reference (`CIR`)
- Last Mean Carried Forward (`LMCF`)
In addition, `rbmi` allows the user to implement their own imputation strategy.
To do this, the user needs to do three things:
1. Define a function implementing the new imputation strategy.
2. Specify which patients use this strategy in the `data_ice` dataset provided to `draws()`.
3. Provide the imputation strategy function to `impute()`.
The imputation strategy function must take 3 arguments (`pars_group`, `pars_ref`, and `index_mar`) and calculates the mean and covariance matrix of the subject's marginal imputation distribution which will then be applied to subjects to which the strategy applies.
Here, `pars_group` contains the predicted mean trajectory (`pars_group$mu`, a numeric vector) and covariance matrix (`pars_group$sigma`) for a subject conditional on their assigned treatment group and covariates.
`pars_ref` contains the corresponding mean trajectory and covariance matrix conditional on the reference group and the subject's covariates.
`index_mar` is a logical vector which specifies for each visit whether the visit is unaffected by an ICE handled using a non-MAR method or not.
As an example, the user can check how the CIR strategy was implemented by looking at function `strategy_CIR()`.
```{r}
strategy_CIR
```
To further illustrate this for a simple example, assume that a new strategy is to be implemented as follows:
- The marginal mean of the imputation distribution is equal to the marginal mean trajectory for the subject according to their assigned group and covariates up to the ICE.
- After the ICE the marginal mean of the imputation distribution is equal to the average of the visit-wise marginal means based on the subjects covariates and the assigned group or the reference group, respectively.
- For the covariance matrix of the marginal imputation distribution, the covariance matrix from the assigned group is taken.
To do this, we first need to define the imputation function which for this example could be coded as follows:
```{r}
strategy_AVG <- function(pars_group, pars_ref, index_mar) {
mu_mean <- (pars_group$mu + pars_ref$mu) / 2
x <- pars_group
x$mu[!index_mar] <- mu_mean[!index_mar]
return(x)
}
```
And an example showing its use:
```{r}
pars_group <- list(
mu = c(1, 2, 3),
sigma = as_vcov(c(1, 3, 2), c(0.4, 0.5, 0.45))
)
pars_ref <- list(
mu = c(5, 6, 7),
sigma = as_vcov(c(2, 1, 1), c(0.7, 0.8, 0.5))
)
index_mar <- c(TRUE, TRUE, FALSE)
strategy_AVG(pars_group, pars_ref, index_mar)
```
To incorporate this into `rbmi`, `data_ice` needs to be updated such that the strategy `AVG` is specified for visits affected by the ICE. Additionally, the function needs
to be provided to `impute()` via the `getStrategies()` function as shown below:
```{r}
data_ice_AVG <- data_ice_CIR %>%
mutate(strategy = ifelse(strategy == "CIR", "AVG", strategy))
draw_obj <- draws(
data = data,
data_ice = data_ice_AVG,
vars = vars,
method = method,
quiet = TRUE
)
impute_obj <- impute(
draw_obj,
references = c("Control" = "Control", "Intervention" = "Control"),
strategies = getStrategies(AVG = strategy_AVG)
)
```
Then, the analysis could proceed by calling `analyse()` and `pool()` as before.
# Custom analysis functions
By default `rbmi` will analyse the data by using the `ancova()` function.
This analysis function fits an ANCOVA model to the outcomes from each visit separately,
and returns the "treatment effect" estimate as well as the corresponding least square means
for each group. If the user wants to perform a different analysis, or return different
statistics from the analysis, then this can be done by using a custom analysis function.
Beware that the validity of the conditional mean imputation method has only been formally established for analysis functions corresponding to linear models (such as ANCOVA) and caution is
required when applying alternative analysis functions for this method.
The custom analysis function must take a `data.frame` as its
first argument and return a named `list` with each element itself being a `list`
containing at a minimum a point estimate, called `est`.
For method `method_bayes()` or `method_approxbayes()`, the list must additionally contain a
standard error (element `se`) and, if available, the degrees of freedom of the complete-data analysis model (element `df`).
As a simple example, we replicate the ANCOVA analysis at the last visit for the CIR-based imputations with a user-defined analysis function below:
```{r}
compare_change_lastvisit <- function(data, ...) {
fit <- lm(change ~ group + outcome_bl, data = data, subset = (visit == 6) )
res <- list(
trt = list(
est = coef(fit)["groupIntervention"],
se = sqrt(vcov(fit)["groupIntervention", "groupIntervention"]),
df = df.residual(fit)
)
)
return(res)
}
ana_obj_CIR6 <- analyse(
impute_obj_CIR,
fun = compare_change_lastvisit,
vars = vars_an
)
pool(ana_obj_CIR6)
```
As a second example, assume that for a supplementary analysis the user wants to compare the proportion of subjects with a change from baseline of >10 points at the last
visit between the treatment groups with the baseline outcome as an additional covariate. This could lead to the following basic analysis function:
```{r}
compare_prop_lastvisit <- function(data, ...) {
fit <- glm(
I(change > 10) ~ group + outcome_bl,
family = binomial(),
data = data,
subset = (visit == 6)
)
res <- list(
trt = list(
est = coef(fit)["groupIntervention"],
se = sqrt(vcov(fit)["groupIntervention", "groupIntervention"]),
df = NA
)
)
return(res)
}
ana_obj_prop <- analyse(
impute_obj_CIR,
fun = compare_prop_lastvisit,
vars = vars_an
)
pool_obj_prop <- pool(ana_obj_prop)
pool_obj_prop
tmp <- as.data.frame(pool_obj_prop) %>%
mutate(
OR = exp(est),
OR.lci = exp(lci),
OR.uci = exp(uci)
) %>%
select(parameter, OR, OR.lci, OR.uci)
tmp
```
Note that if the user wants `rbmi` to use a normal approximation to the pooled test statistics, then the degrees of freedom need to be set to `df = NA` (as per the above example). If the degrees of freedom of the complete data test statistics are known or if the degrees of freedom are set to `df = Inf`, then `rbmi` pools the degrees of freedom across imputed datasets according to the rule by Barnard and Rubin (see the "Statistical Specifications" vignette (`vignette("stat_specs", package = "rbmi")` for details). According to this rule, infinite degrees of freedom for the complete data analysis do not imply that the pooled degrees of freedom are also infinite.
Rather, in this case the pooled degrees of freedom are `(M-1)/lambda^2`, where `M` is the number of imputations and `lambda` is the fraction of missing information (see @Barnard1999 for details).
# Sensitivity analyses: Delta adjustments and tipping point analyses
Delta-adjustments are used to impute missing data under a not missing at random (NMAR) assumption. This reflects the belief that unobserved outcomes would have been systematically "worse" (or "better") than "comparable" observed outcomes. For an extensive discussion of delta-adjustment methods, we refer to @CroEtAlTutorial2020.
In `rbmi`, a marginal delta-adjustment approach is implemented. This means that the delta-adjustment is applied to the dataset after data imputation under MAR or reference-based missing data assumptions and prior to the analysis of the imputed data.
Sensitivity analysis using delta-adjustments can therefore be performed without having to re-fit the imputation model. In `rbmi`, they are implemented via the `delta` argument of the `analyse()` function.
## Simple delta adjustments and tipping point analyses
The `delta` argument of `analyse()` allows users to modify the outcome variable prior to the analysis.
To do this, the user needs to provide a `data.frame` which contains columns for the subject and visit (to identify the observation to be adjusted) plus an additional column called `delta` which specifies the value which will be added to the outcomes prior to the analysis.
The `delta_template()` function supports the user in creating this `data.frame`: it creates a skeleton `data.frame` containing one row per subject and visit with the value of `delta` set to 0 for all observations:
```{r}
dat_delta <- delta_template(imputations = impute_obj_CIR)
head(dat_delta)
```
Note that the output of `delta_template()` contains additional information which can be used to properly re-set variable `delta`.
For example, assume that the user wants to implement a delta-adjustment to the imputed values under CIR described in section \@ref(sec:postICEobs).
Specifically, assume that a fixed "worsening adjustment" of +5 points is applied to all imputed values regardless of the treatment group. This could be programmed as follows:
```{r}
# Set delta-value to 5 for all imputed (previously missing) outcomes and 0 for all other outcomes
dat_delta <- delta_template(imputations = impute_obj_CIR) %>%
mutate(delta = is_missing * 5)
# Repeat the analyses with the delta-adjusted values and pool results
ana_delta <- analyse(
impute_obj_CIR,
delta = dat_delta,
vars = vars_an
)
pool(ana_delta)
```
The same approach can be used to implement a tipping point analysis. Here, we apply different delta-adjustments to imputed data from the control and the intervention group, respectively. Assume that delta-adjustments by less then -5 points or by more than +15 points are considered implausible from a clinical perspective. Therefore, we vary the delta-values in each group between -5 to +15 points to investigate which delta combinations lead to a "tipping" of the primary analysis result, defined here as an analysis p-value $\geq 0.05$.
```{r fig.width=5, fig.height=4}
perform_tipp_analysis <- function(delta_control, delta_intervention) {
# Derive delta offset based on control and intervention specific deltas
delta_df <- delta_df_init %>%
mutate(
delta_ctl = (group == "Control") * is_missing * delta_control,
delta_int = (group == "Intervention") * is_missing * delta_intervention,
delta = delta_ctl + delta_int
)
ana_delta <- analyse(
impute_obj_CIR,
fun = compare_change_lastvisit,
vars = vars_an,
delta = delta_df,
)
pool_delta <- as.data.frame(pool(ana_delta))
list(
trt_effect_6 = pool_delta[["est"]],
pval_6 = pool_delta[["pval"]]
)
}
# Get initial delta template
delta_df_init <- delta_template(impute_obj_CIR)
tipp_frame_grid <- expand.grid(
delta_control = seq(-5, 15, by = 2),
delta_intervention = seq(-5, 15, by = 2)
) %>%
as_tibble()
tipp_frame <- tipp_frame_grid %>%
mutate(
results_list = map2(delta_control, delta_intervention, perform_tipp_analysis),
trt_effect_6 = map_dbl(results_list, "trt_effect_6"),
pval_6 = map_dbl(results_list, "pval_6")
) %>%
select(-results_list) %>%
mutate(
pval = cut(
pval_6,
c(0, 0.001, 0.01, 0.05, 0.2, 1),
right = FALSE,
labels = c("<0.001", "0.001 - <0.01", "0.01- <0.05", "0.05 - <0.20", ">= 0.20")
)
)
# Show delta values which lead to non-significant analysis results
tipp_frame %>%
filter(pval_6 >= 0.05)
ggplot(tipp_frame, aes(delta_control, delta_intervention, fill = pval)) +
geom_raster() +
scale_fill_manual(values = c("darkgreen", "lightgreen", "lightyellow", "orange", "red"))
```
According to this analysis, the significant test result from the primary analysis under CIR could only be tipped to a non-significant result for rather extreme delta-adjustments. Please note that for a real analysis it is recommended to use a smaller step size in the grid than what has been used here.
## More flexible delta-adjustments using the `dlag` and `delta` arguments of `delta_template()`
So far, we have only discussed simple `delta` arguments which add the same value to all imputed values.
However, the user may want to apply more flexible delta-adjustments to missing values after an intercurrent event (ICE) and vary the magnitude of the delta adjustment depending on the how far away the visit in question is from the ICE visit.
To facilitate the creation of such flexible delta-adjustments, the `delta_template()` function has two optional additional arguments `delta`
and `dlag`. The `delta` argument specifies the default amount of delta
that should be applied to each post-ICE visit, whilst
`dlag` specifies the scaling coefficient to be applied based upon the visits proximity
to the first visit affected by the ICE. By default, the delta will only be added to unobserved (i.e. imputed) post-ICE
outcomes but this can be changed by setting the optional argument `missing_only = FALSE`.
The usage of the `delta` and `dlag` arguments is best illustrated with a few examples:
Assume a setting with 4 visits and that the user specified `delta = c(5,6,7,8)` and `dlag=c(1,2,3,4)`.
For a subject for whom the first visit affected by the ICE is visit 2, these values of `delta` and `dlag` would imply the following delta offset:
```
v1 v2 v3 v4
--------------
5 6 7 8 # delta assigned to each visit
0 1 2 3 # scaling starting from the first visit after the subjects ICE
--------------
0 6 14 24 # delta * scaling
--------------
0 6 20 44 # cumulative sum (i.e. delta) to be applied to each visit
```
That is, the subject would have a delta offset of 0 applied to visit v1, 6 for visit v2, 20 for visit v3 and 44 for visit v4.
Assume instead, that the subject's first visit affected by the ICE was visit 3. Then, the above values of `delta` and `dlag` would imply the following delta offset:
```
v1 v2 v3 v4
--------------
5 6 7 8 # delta assigned to each visit
0 0 1 2 # scaling starting from the first visit after the subjects ICE
--------------
0 0 7 16 # delta * scaling
--------------
0 0 7 23 # cumulative sum (i.e. delta) to be applied to each visit
```
To apply a constant delta value of +5 to all visits affected by the ICE
regardless of their proximity to the first ICE visit, one could set `delta = c(5,5,5,5)` and `dlag = c(1,0,0,0)`.
Alternatively, it may be more straightforward for this setting to call the `delta_template()` function without the `delta` and `dlag` arguments and then overwrite the `delta` column of the resulting `data.frame` as described in the previous section (and additionally relying on the `is_post_ice` variable).
Another way of using these arguments is to set `delta` to the difference in time
between visits and dlag to be the amount of delta per unit of time. For example,
let's say that visits occur on weeks 1, 5, 6 and 9 and that we want a delta of 3
to be applied for each week after an ICE.
For simplicity, we assume that the ICE occurs immediately after the subject's last visit which
is not affected by the ICE. This this could be achieved by setting
`delta = c(1,4,1,3)` (the difference in weeks between each visit) and `dlag = c(3, 3, 3, 3)`.
Assume a subject's first visit affected by the ICE was visit v2, then these values of `delta` and `dlag` would imply the following delta offsets:
```
v1 v2 v3 v4
--------------
1 4 1 3 # delta assigned to each visit
0 3 3 3 # scaling starting from the first visit after the subjects ICE
--------------
0 12 3 9 # delta * scaling
--------------
0 12 15 24 # cumulative sum (i.e. delta) to be applied to each visit
```
To wrap up, we show this in action for our simulated dataset from section \@ref(sec:dataSimul) and the imputed datasets
based on a CIR assumption from section \@ref(sec:postICEobs).
The simulation setting specified follow-up visits at months 2, 4, 6, 8, 10, and 12.
Assume that we want to apply a delta-adjustment of 1 for every month after an ICE to unobserved post-ICE visits from the intervention group only. (E.g. if the ICE occurred immediately after the month 4 visit, then the total delta applied to a missing value from the month 10 visit would be 6.)
To program this, we first use the `delta` and `dlag` arguments of `delta_template()` to set up a corresponding template `data.frame`:
```{r}
delta_df <- delta_template(
impute_obj_CIR,
delta = c(2, 2, 2, 2, 2, 2),
dlag = c(1, 1, 1, 1, 1, 1)
)
head(delta_df)
```
Next, we can use the additional metadata variables provided by `delta_template()` to manually
reset the delta values for the control group back to 0:
```{r}
delta_df2 <- delta_df %>%
mutate(delta = if_else(group == "Control", 0, delta))
head(delta_df2)
```
Finally, we can use our delta `data.frame` to apply the desired delta offset to our analysis:
```{r}
ana_delta <- analyse(impute_obj_CIR, delta = delta_df2, vars = vars_an)
pool(ana_delta)
```
# References