/
marginal_tidiers.Rmd
767 lines (598 loc) · 26.9 KB
/
marginal_tidiers.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
---
title: "Marginal effects / slopes, contrasts, means and predictions with broom.helpers"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
rows.print = 25
)
# several packages required to compute the vignette
library(broom.helpers)
if (
!.assert_package("gtsummary", boolean = TRUE) ||
!.assert_package("ggstats", boolean = TRUE) ||
!.assert_package("margins", boolean = TRUE) ||
!.assert_package("emmeans", boolean = TRUE) ||
!.assert_package("effects", boolean = TRUE) ||
!.assert_package("marginaleffects", boolean = TRUE) ||
!.assert_package("ggeffects", boolean = TRUE) ||
!.assert_package("dplyr", boolean = TRUE) ||
!.assert_package("scales", boolean = TRUE)
) {
knitr::opts_chunk$set(eval = FALSE)
}
```
## Terminology
The overall idea of "marginal effects" is too provide tools to better interpret the results of a model by estimating several quantities at the margins. However, it has been implemented in many different ways by different ways and there is a bunch of quasi-synonyms for the idea of "marginal effects": statistical effects, marginal effects, marginal means, contrasts, marginal slopes, conditional effects, conditional marginal effects, marginal effects at the mean, and many other similarly-named ideas.
In `{broom.helpers}`, we tried to adopt a terminology consistent with the [`{marginaleffects}`](https://vincentarelbundock.github.io/marginaleffects/#definitions) package, first released in September 2021, and with [Andrew Heiss' Marginalia blog post](https://www.andrewheiss.com/blog/2022/05/20/marginalia/) published in May 2022.
**Adjusted Predictions** correspond to the outcome predicted by a fitted model on a specified scale for a given combination of values of the predictor variables, such as their observed values, their means, or factor levels (a.k.a. "reference grid"). When prediction are averaged according to a specific regressor, we will then refer to **Marginal Predictions**.
**Marginal Contrasts** are referring to a comparison (e.g. difference) of the outcome for a certain regressor, considering "meaningfully" or "typical" values for the other predictors (at the mean/mode, at custom values, averaged over observed values...). Contrasts could be computed for categorical variables (e.g. difference between two specific levels) or for continuous variables (change in the outcome for a certain change of the regressor).
**Marginal Effects** **/ Slopes** are defined for continuous variables as a partial derivative (slope) of the regression equation with respect to a regressor of interest. Put differently, the marginal effect is the slope of the prediction function, measured at a specific value of the regressor of interest. In scientific practice, the marginal effects fall in the same toolbox as the marginal contrasts.
**Marginal Means** are adjusted predictions of a model, averaged across a "reference grid" of categorical predictors. They are similar to marginal predictions, but with subtle differences.
`{broom.helpers}` embed several custom tidiers to compute such quantities and to return a tibble compatible with `tidy_plus_plus()` and all others `{broom.helpers}`'s `tidy_*()` function. Therefore, it is possible to produce nicely formatted tables with `gtsummary::tbl_regression()` or forest plots with `ggstats::ggcoef_model()`.
## Data preparation
Let's consider the `trial` dataset from the `{gtsummary}` package and build a logistic regression model with two categorical predictors (`trt` and `stage`) and two continuous predictor (`marker` and `age`). We will include an interaction between `trt` and `marker` and polynomial terms for `age` (i.e. `age` and `age^2`).
```{r, message=FALSE}
library(broom.helpers)
library(gtsummary)
library(dplyr)
d <- trial %>%
filter(complete.cases(response, trt, marker, grade, age))
mod <- glm(
response ~ trt * marker + stage + poly(age, 2),
data = d,
family = binomial
)
mod %>%
tbl_regression(
exponentiate = TRUE
) %>%
bold_labels()
```
## Marginal Predictions
### Marginal Predictions at the Mean
A first approach to better understand / interpret the model consists to predict the value of a regressor, on the model scale, at "typical values" of the other regressors. The estimates are therefore easier to interpret, as they are expressed on the the scale of the outcome (here, for a binary logistic regression, as probabilities). The differences observed between the predictions at different modalities will depend only on the "effect" of that regressor as the others regressors will be fixed at the same "typical values". However, all packages do not use the same definition of "typical values".
#### the `{effects}`'s approach
The `{effects}` package offer an `effects::Effect()` function to compute marginal predictions at typical values. Although the function is named `Effect()`, the produced estimates are marginal predictions according to the terminology presented at the beginning of this vignette.
```{r}
library(effects, quietly = TRUE)
e <- Effect("stage", mod)
e
plot(e)
```
To understand what are the "typical values" used by `effects::Effect()`, let's have a look at the model matrix generated by the package and used for predictions.
```{r}
e$model.matrix
```
The other continuous regressors are set to their observed mean while the other categorical regressors are weighted according to their observed proportions. Somehow, an artificial "averaged" individual is created, of mean age and mean marker level, and being partly receiving Drug A and Drug B. And then, we predict the probability of `response` if this individual would be in stage T1, T2, T3 or T4.
For a continuous variable, `effects::Effect()` will consider several values of the regressor (based on the range of observed values) to estimate marginal predictions at these different values.
```{r}
e2 <- Effect("age", mod)
e2
plot(e2)
```
The `effects::allEffects()` will build all marginal predictions of all regressors, taking into account eventual interactions within the model.
```{r}
allEffects(mod)
plot(allEffects(mod))
```
It is also possible to generate similar plots with `ggeffects::ggeffect()`. Please note that `ggeffects::ggeffect()` will consider, by default, only individual variables from the model and not existing interactions.
```{r}
mod %>%
ggeffects::ggeffect() %>%
lapply(plot) %>%
patchwork::wrap_plots()
```
To generate a tibble of these results formatted in a way that it could be use with `tidy_plus_plus()` and other `{broom.helpers}`'s `tidy_*()` helpers, `{broom.helpers}` provides a `tidy_all_effects()` tieder.
```{r}
tidy_all_effects(mod)
```
It is therefore very easy to produce a nicely formatted table with `gtsummary::tbl_regression()` or a forest plot with `ggstats::ggcoef_model()`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_all_effects,
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
bold_labels()
```
```{r}
ggstats::ggcoef_model(
mod,
tidy_fun = tidy_all_effects,
vline = FALSE
)
```
#### the `{marginaleffects}`'s approach at the Mean
The `{marginaleffects}` package allows to compute marginal predictions "at the mean", i.e. by considering the mean of the other continuous regressors and the mode (i.e. the most frequent observed modality) of categorical regressors. For that, we should call `marginaleffects::predictions()` with `newdata = "mean"`.
```{r}
library(marginaleffects)
predictions(
mod,
variables = "stage",
newdata = "mean",
by = "stage"
)
```
Four "mean individuals" were generated, with just the value of `stage` being different from one individual to the other, before predicting the probability of `response`.
For a continuous variable, predictions will be made, by default, at Tukey's five numbers, i.e. the minimum, the first quartile, the median, the third quartile and the maximum.
```{r}
predictions(
mod,
variables = "age",
newdata = "mean",
by = "age"
)
```
`{broom.helpers}` provides a global tidier `tidy_marginal_predictions()` to compute the marginal predictions for each variable or combination of variables before stacking them in a unique tibble. You should specify `newdata = "mean"` to get marginal predictions at the mean. By default, as `effects::allEffects()`, it will consider all higher order combinations of variables (as identified with `model_list_higher_order_variables()`).
```{r}
mod %>%
model_list_higher_order_variables()
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
newdata = "mean",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
Simply specify `variables_list = "no_interaction"` to compute marginal predictions for each individual variable without considering existing interactions.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
variables_list = "no_interaction",
newdata = "mean",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
`{broom.helpers}` also include `plot_marginal_predictions()` to generate a list of plots to visualize all marginal predictions. Use `patchwork::wrap_plots()` to combine all plots together.
```{r}
p <- plot_marginal_predictions(mod, newdata = "mean") %>%
patchwork::wrap_plots() &
ggplot2::scale_y_continuous(
labels = scales::label_percent(),
limits = c(-0.2, 1)
)
p + patchwork::plot_annotation(
title = "Marginal Predictions at the Mean"
)
```
```{r}
p <- mod %>%
plot_marginal_predictions(
"no_interaction",
newdata = "mean"
) %>%
patchwork::wrap_plots() &
ggplot2::scale_y_continuous(
labels = scales::label_percent(),
limits = c(-0.2, 1)
)
p + patchwork::plot_annotation(
title = "Marginal Predictions at the Mean"
)
```
Alternatively, you can use `ggstats::ggcoef_model()`, using `tidy_args` to pass arguments to `broom.helpers::tidy_marginal_predictions()`.
```{r}
ggstats::ggcoef_model(
mod,
tidy_fun = tidy_marginal_predictions,
tidy_args = list(newdata = "mean", variables_list = "no_interaction"),
vline = FALSE,
show_p_values = FALSE,
signif_stars = FALSE,
significance = NULL
)
```
### Average Marginal Predictions
Instead of averaging observed values to generate "typical observations" before predicting the outcome, an alternative consists to predict the outcome on the overall observed values before averaging the results.
More precisely, the purpose is to adopt a counterfactual approach. Let's take an example. Let's consider `d` our observed data used to estimate the model. We can make a copy of this dataset, where all variables would be identical, but considering that all individuals have received Drug A. Similarly, we could generate a dataset where all individuals would have received Drug B.
```{r}
dA <- d %>%
mutate(trt = "Drug A")
dB <- d %>%
mutate(trt = "Drug B")
```
We can now predict the outcome for all observations in `dA` and then compute the average, and similarly with `dB`.
```{r}
predict(mod, newdata = dA, type = "response") %>% mean()
predict(mod, newdata = dB, type = "response") %>% mean()
```
We, then, obtain **Average Marginal Predictions** for `trt`. The same results could be computed with `marginaleffects::avg_predictions()`. Note that the counterfactual approach corresponds to the default behavior when no value is provided to `newdata`.
```{r}
avg_predictions(mod, variables = "trt", by = "trt", type = "response")
```
**Important:** since version 0.10.0 of `marginaleffects`, we had to add `type = "response"` to get this result: for `glm` models, predictions are done on the response scale, before being averaged. If `type` are not specified, predictions will be made on the link scale, before being averaged and then back transformed on the response scale. Thus, the average prediction may not be exactly identical to the average of predictions.
```{r}
avg_predictions(mod, variables = "trt", by = "trt")
b <- binomial()
predict(mod, newdata = dA, type = "link") %>%
mean() %>%
b$linkinv()
predict(mod, newdata = dB, type = "link") %>%
mean() %>%
b$linkinv()
```
We can use `tidy_marginal_predictions()` to get average marginal predictions for all variables and `plot_marginal_predictions()` for a visual representation.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
type = "response",
variables_list = "no_interaction",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
type = "response",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
```{r}
p <- plot_marginal_predictions(mod, type = "response") %>%
patchwork::wrap_plots(ncol = 2) &
ggplot2::scale_y_continuous(
labels = scales::label_percent(),
limits = c(-0.2, 1)
)
p + patchwork::plot_annotation(
title = "Average Marginal Predictions"
)
```
### Marginal Means and Marginal Predictions at Marginal Means
The `{emmeans}` package adopted, by default, another approach based on **marginal means** or *estimated marginal means* (a.k.a. emmeans).
It will consider a grid of predictors with all combinations of the observed modalities of the categorical variables and fixing continuous variables at their means.
Let's call `marginaleffects::predictions()` with `newdata = "marginalmeans"`.
```{r}
pred <- predictions(mod, newdata = "marginalmeans")
pred
```
As we can see, `pred` contains 8 rows, one for each combination of `trt` (2 modalities) and `stage` (4 modalities). `age` is fixed at his mean (`mean(d$age)`) as well as `marker`.
Let's compute the average predictions for each value of `stage`.
```{r}
pred %>%
group_by(stage) %>%
summarise(mean(estimate))
```
We can check that we obtain the same estimates as with `emmeans::emmeans()`.
```{r}
emmeans::emmeans(mod, "stage", type = "response")
```
The function `marginaleffects::marginal_means()` allows to compute directly marginal means of all categorical variables.
```{r}
marginal_means(mod)
```
`{broom.helpers}` provides a `tidy_marginal_means()` tidier.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_means,
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
Marginal means are defined only for categorical variables. However, we can define **marginal predictions at marginal means** for both continuous and categorical variables, calling `tidy_marginal_predictions()` with the option `newdata = "marginalmeans"`. For categorical variables, marginal predictions at marginal means will be equal to marginal means.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
newdata = "marginalmeans",
variables_list = "no_interaction",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
### Alternative approaches
#### Marginal Predictions at the Median
They are similar to marginal predictions at the mean, except that continuous variables are fixed at the median of observed values (and categorical variables at their mode). Simply use `newdata = "median"`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_predictions,
newdata = "median",
variables_list = "no_interaction",
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
modify_column_hide("p.value") %>%
bold_labels()
```
#### the `ggeffects::ggpredict()`'s approach
The `{ggeffects}` package offers a `ggeffects::ggpredict()` function which generates marginal predictions at the mean of continuous variables and at the first modality (used as reference) of categorical variables. `{broom.helpers}` provides a `tidy_ggpredict()` tidier.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_ggpredict,
estimate_fun = scales::label_percent(accuracy = .1)
) %>%
bold_labels()
```
```{r}
mod %>%
ggeffects::ggpredict() %>%
plot() %>%
patchwork::wrap_plots()
```
## Marginal Contrasts
Now that we have a way to estimate marginal predictions, we can easily compute **marginal contrasts**, i.e. difference between marginal predictions.
### Average Marginal Contrasts
Let's consider first a categorical variable, e.g. `stage`. Average Marginal Predictions are obtained with `marginaleffects::avg_predictions()`.
```{r}
pred <- avg_predictions(mod, variables = "stage", by = "stage", type = "response")
pred
```
The contrast between `"T2"` and `"T1"` is simply the difference between the two adjusted predictions:
```{r}
pred$estimate[2] - pred$estimate[1]
```
The `marginaleffects::avg_comparisons()` function allows to compute all differences between adjusted predictions.
```{r}
comp <- avg_comparisons(mod, variables = "stage")
comp
```
*Note:* in fact, `avg_comparisons()` has computed the contrasts for each observed values before averaging it. By construction, it is equivalent to the difference of the average marginal predictions.
As the contrast has been averaged over the observed values, we can call them **average marginal contrast**.
By default, each modality is contrasted with the first one taken as a reference.
```{r}
avg_comparisons(mod, variables = "stage")
```
Other types of contrasts could be specified using the `variables` argument.
```{r}
avg_comparisons(mod, variables = list(stage = "sequential"))
avg_comparisons(mod, variables = list(stage = "pairwise"))
```
Let's consider a continuous variable:
```{r}
avg_comparisons(mod, variables = "age")
```
By default, `marginaleffects::avg_comparisons()` computes, for each observed value, the effect of increasing `age` by one unit (comparing adjusted predictions when the regressor is equal to its observed value minus 0.5 and its observed value plus 0.5). It is possible to compute a contrast for another gap, for example the average difference for an increase of 10 years:
```{r}
avg_comparisons(mod, variables = list(age = 10))
```
Contrasts for all individual predictors could be easily obtained:
```{r}
avg_comparisons(mod)
```
It should be noted that column names are not consistent with other tidiers used by `broom.helpers`. Therefore, a `comparisons` object should not be passed directly to `tidy_plus_plus()`. Instead, you should use `broom.helpers::tidy_avg_comparisons()`.
```{r}
tidy_avg_comparisons(mod)
```
This custom tidier is compatible with `tidy_plus_plus()` and the suit of other functions provided by `{broom.helpers}`.
```{r}
mod %>%
tidy_plus_plus(tidy_fun = tidy_avg_comparisons)
```
A nicely formatted table can therefore be generated with `gtsummary::tbl_regression()`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_comparisons,
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
Similarly, a forest plot could be produced with `ggstats::ggcoef_model()`.
```{r}
ggstats::ggcoef_model(mod, tidy_fun = tidy_avg_comparisons) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
### Marginal Contrasts at the Mean
Instead of computing contrasts for each observed values before averaging, another approach consist of considering an hypothetical individual whose characteristics correspond to the "average" before predicting results and computing contrasts.
It could be achieved with `{marginaleffects}` by using `newdata = "mean"`. In that case, it will consider an individual where continuous predictors are equal to the mean of observed values and where categorical predictors will be set to the mode (i.e. most frequent value) of the observed values.
```{r}
pred <- predictions(mod, variables = "trt", newdata = "mean")
pred
pred$estimate[2] - pred$estimate[1]
comparisons(mod, variables = "trt", newdata = "mean")
```
The `newdata` argument can be passed to `tidy_avg_comparisons()`, `tidy_plus_plus` or `gtsummary::tbl_regression()`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_comparisons,
newdata = "mean",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
For `ggstats::ggcoef_model()`, use `tidy_args` to pass `newdata = "mean"`.
```{r}
mod %>%
ggstats::ggcoef_model(
tidy_fun = tidy_avg_comparisons,
tidy_args = list(newdata = "mean")
) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
### Alternative approaches
Other assumptions, such as `"marginalmeans"` or `"median"`, could be defined using `newdata`. See the documentation of `marginaleffects::comparisons()`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_comparisons,
newdata = "marginalmeans",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
mod %>%
ggstats::ggcoef_model(
tidy_fun = tidy_avg_comparisons,
tidy_args = list(newdata = "marginalmeans")
) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
### Dealing with interactions
In our model, we defined an interaction between `trt` and `marker`. Therefore, we could be interested to compute the contrast of `marker` for each value of `trt`.
```{r}
avg_comparisons(
mod,
variables = list(marker = 1),
newdata = datagridcf(trt = unique),
by = "trt",
)
```
Alternatively, it is possible to compute "cross-contrasts" showing what is happening when both `marker` and `trt` are changing.
```{r}
avg_comparisons(
mod,
variables = list(marker = 1, trt = "reference"),
cross = TRUE
)
```
The tidier `tidy_marginal_contrasts()` allows to compute directly several combinations of variables and to stack all the results in a unique tibble.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_contrasts,
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
ggstats::ggcoef_model(mod, tidy_fun = tidy_marginal_contrasts)
```
By default, when there is an interaction, contrasts are computed for the last variable of the interaction according to the different values of the first variables (if one of this variable is continuous, using Tukey's five numbers).
The option `variables_list = "cross"` could be used to get "cross-contrasts" for interactions.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_contrasts,
variables_list = "cross",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
The option `variables_list = "no_interaction"` could be used to get the average marginal contrasts for each variable without considering interactions.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_marginal_contrasts,
variables_list = "no_interaction",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
ggstats::ggcoef_model(
mod,
tidy_fun = tidy_marginal_contrasts,
tidy_args = list(variables_list = "no_interaction")
)
```
As before, to display marginal contrasts at the mean, indicate `newdata = "mean"`. For more information on the way to customize the combination of variables, see the documentation and examples of `tidy_marginal_contrasts()`.
## Marginal Effects / Marginal Slopes
Marginal effects are similar to marginal contrasts with a subtle difference. For a continuous regressor, a marginal contrast could be seen as a difference while a marginal effect is a partial derivative. Put differently, the marginal effect of a continuous regressor $x$ is the **slope** of the prediction function $y$, measured at a specific value of $x$, i.e. ${\partial y}/{\partial x}$.
Marginal effects are expressed according to the scale of the model and represent the expected change on the outcome for an increase of one unit of the regressor.
By definition, marginal effects are not defined for categorical variables, marginal contrasts being reported instead.
Like marginal contrasts, several approaches exist to compute marginal effects. For more details, see the [dedicated vignette](https://vincentarelbundock.github.io/marginaleffects/articles/marginaleffects.html) of the `{marginaleffects}` package.
### Average Marginal Effects (AME)
A marginal effect will be computed for each observed values before being averaged with `marginaleffects::avg_slopes()`.
```{r}
avg_slopes(mod)
```
Column names are not consistent with other tidiers used by `{broom.helpers}`. Use `tidy_avg_slopes()` instead.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_slopes,
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
mod %>%
ggstats::ggcoef_model(
tidy_fun = tidy_avg_slopes
) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
Please note that for categorical variables, marginal contrasts are returned.
Same results could be obtained with `margins::margins()` function inspired by **Stata**'s `margins` command. As `margins::margins()` is not compatible with `stats::poly()`, we will rewrite our model, replacing `poly(age, 2)` by `age + age^2`.
```{r}
mod_alt <- glm(
response ~ trt * marker + stage + age + age^2,
data = d,
family = binomial
)
margins::margins(mod_alt) %>% tidy()
```
For `{broom.helpers}`, `{gtsummary}` or `{ggstats}`, use `tidy_margins()`.
```{r}
mod_alt %>%
tbl_regression(
tidy_fun = tidy_margins,
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
### Marginal Effects at the Mean (MEM)
For marginal effects at the mean[^1], simple use `newdata = "mean"`.
[^1]: More precisely, `marginaleffects::marginaleffects()` use the mean of continuous variables and the mode of categorical variables.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_slopes,
newdata = "mean",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
mod %>%
ggstats::ggcoef_model(
tidy_fun = tidy_avg_slopes,
tidy_args = list(newdata = "mean")
) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
### Marginal Effects at Marginal Means
Simply use `newdata = "marginalmeans"`.
```{r}
mod %>%
tbl_regression(
tidy_fun = tidy_avg_slopes,
newdata = "marginalmeans",
estimate_fun = scales::label_percent(style_positive = "plus")
) %>%
bold_labels()
```
```{r}
mod %>%
ggstats::ggcoef_model(
tidy_fun = tidy_avg_slopes,
tidy_args = list(newdata = "marginalmeans")
) +
ggplot2::scale_x_continuous(
labels = scales::label_percent(style_positive = "plus")
)
```
## Further readings
- [Documentation of the `marginaleffects` package](https://vincentarelbundock.github.io/marginaleffects/) by Vincent Arel-Bundock
- [Marginalia: A guide to figuring out what the heck marginal effects, marginal slopes, average marginal effects, marginal effects at the mean, and all these other marginal things are](https://www.andrewheiss.com/blog/2022/05/20/marginalia/) by Andrew Heiss
- [Introduction to Adjusted Predictions and Marginal Effects in R](https://strengejacke.github.io/ggeffects/articles/introduction_marginal_effects.html) by Daniel Lüdecke
- [An Introduction to `margins`](https://cran.r-project.org/package=margins/vignettes/Introduction.html)