/
C-composition-MMLM.Rmd
executable file
·205 lines (172 loc) · 7.85 KB
/
C-composition-MMLM.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
---
title: "Multilevel Model with Compositional Outcomes"
output:
html_document:
theme: spacelab
highlight: kate
toc: yes
toc_float: yes
collapsed: no
smooth_scroll: no
toc_depth: 4
fig_width: 6
fig_height: 4
fig_caption: yes
number_sections: true
vignette: >
%\VignetteIndexEntry{Multilevel Model with Compositional Outcomes}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
In this vignette, we discuss how to specify multilevel models with compositional outcomes using `multilevelcoda`.
In addition to `multilevelcoda`, we will use `brms` package (to fit models) and
`bayestestR` package (to compute useful indices and compare models).
We will also attach built in datasets `mcompd` (simulated compositional sleep and wake variables)
and `sbp` (sequential binary partition).
```r
library(multilevelcoda)
library(brms)
library(bayestestR)
data("mcompd")
data("sbp")
options(digits = 3)
```
# Multilevel model with compositional outcomes.
## Computing compositions and isometric log ratio coordinates.
The ILR coordinates outcomes can be calculated using the `compilr()` functions.
```r
cilr <- compilr(data = mcompd, sbp = sbp,
parts = c("TST", "WAKE", "MVPA", "LPA", "SB"), idvar = "ID", total = 1440)
head(cilr$TotalILR)
#> ilr1 ilr2 ilr3 ilr4
#> [1,] 0.287 1.20 0.6270 1.702
#> [2,] -0.472 1.57 -0.8336 0.984
#> [3,] -0.486 1.33 1.3344 2.659
#> [4,] -0.316 1.37 -0.0332 0.551
#> [5,] 0.205 1.43 -0.6893 0.733
#> [6,] -0.446 1.16 -0.0950 0.670
#> attr(,"class")
#> [1] "rmult"
```
## Fitting model
A model with multilevel compositional outcomes is multivariate, as it has multiple ILR coordinate outcomes,each of which is predicted by a set of predictors.
Our `brms` model can be then fitted using the `brmcoda()` function.
```r
mv <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ Stress + (1 | ID),
cores = 8, seed = 123, backend = "cmdstanr")
#> Warning: In the future, 'rescor' will be set to FALSE by default for all models. It is thus recommended to explicitely set 'rescor'
#> via 'set_rescor' instead of using the default.
```
Here is a `summary()` of the model.
We can see that stress significantly predicted `ilr1` and `ilr2`.
```r
summary(mv)
#> Family: MV(gaussian, gaussian, gaussian, gaussian)
#> Links: mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> Formula: ilr1 ~ Stress + (1 | ID)
#> ilr2 ~ Stress + (1 | ID)
#> ilr3 ~ Stress + (1 | ID)
#> ilr4 ~ Stress + (1 | ID)
#> Data: tmp (Number of observations: 3540)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Group-Level Effects:
#> ~ID (Number of levels: 266)
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(ilr1_Intercept) 0.33 0.02 0.30 0.37 1.00 1203 1547
#> sd(ilr2_Intercept) 0.30 0.01 0.28 0.34 1.00 1097 2116
#> sd(ilr3_Intercept) 0.39 0.02 0.35 0.43 1.00 1594 2478
#> sd(ilr4_Intercept) 0.30 0.02 0.27 0.33 1.00 1687 2483
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> ilr1_Intercept -0.43 0.02 -0.48 -0.38 1.00 1089 1836
#> ilr2_Intercept 1.47 0.02 1.42 1.51 1.00 877 2029
#> ilr3_Intercept -0.87 0.03 -0.93 -0.82 1.00 1644 2583
#> ilr4_Intercept 0.65 0.02 0.60 0.70 1.00 1948 2670
#> ilr1_Stress -0.01 0.00 -0.02 -0.00 1.00 6441 3679
#> ilr2_Stress 0.01 0.00 0.00 0.01 1.00 5245 3596
#> ilr3_Stress 0.00 0.00 -0.01 0.01 1.00 6436 3519
#> ilr4_Stress 0.01 0.00 -0.00 0.01 1.00 6209 3303
#>
#> Family Specific Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma_ilr1 0.44 0.01 0.43 0.45 1.00 5231 3458
#> sigma_ilr2 0.38 0.00 0.37 0.39 1.00 5176 3232
#> sigma_ilr3 0.70 0.01 0.68 0.71 1.00 5339 3388
#> sigma_ilr4 0.53 0.01 0.51 0.54 1.00 5250 3363
#>
#> Residual Correlations:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> rescor(ilr1,ilr2) -0.54 0.01 -0.57 -0.52 1.00 5202 3296
#> rescor(ilr1,ilr3) -0.18 0.02 -0.21 -0.14 1.00 4833 3061
#> rescor(ilr2,ilr3) -0.05 0.02 -0.09 -0.02 1.00 5156 3254
#> rescor(ilr1,ilr4) 0.11 0.02 0.07 0.14 1.00 4910 3445
#> rescor(ilr2,ilr4) -0.05 0.02 -0.08 -0.01 1.00 5286 3449
#> rescor(ilr3,ilr4) 0.56 0.01 0.53 0.58 1.00 5255 3316
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
```
# Bayes Factor for compositional multilevel modelling
We are often interested in whether a predictor significantly predict the overall composition, in addition to the individual ILR coordinates.
In Bayesian, this can be done by comparing the marginal likelihoods of two models.
Bayes Factors (BFs) are indices of relative evidence of one model over another.
In the context of compositional multilevel modelling, Bayes Factors provide two main useful functions:
- Testing single parameters within a model
- Comparing models
We can utilize Bayes factors to answer the following question:
*"Which model (i.e., set of composition predictors, expressed as ILRs) is more likely to have produced the observed data?"*
Let's examine whether stress predicts the overall sleep-wake composition.
*Note*: To use Bayes factors, `brmsfit` models must be fitted with an additional non-default argument
`save_pars = save_pars(all = TRUE)`.
```r
# intercept only
mv0 <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ 1 + (1 | ID),
iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Warning: In the future, 'rescor' will be set to FALSE by default for all models. It is thus recommended to explicitely set 'rescor'
#> via 'set_rescor' instead of using the default.
# full model
mv <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ Stress + (1 | ID),
iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Warning: In the future, 'rescor' will be set to FALSE by default for all models. It is thus recommended to explicitely set 'rescor'
#> via 'set_rescor' instead of using the default.
```
We can now compare these models with the `bayesfactor_models()` function
```r
bayes_factor(mv$Model, mv0$Model)
#> Iteration: 1
#> Iteration: 2
#> Iteration: 3
#> Iteration: 4
#> Iteration: 5
#> Iteration: 6
#> Iteration: 7
#> Iteration: 8
#> Iteration: 9
#> Iteration: 1
#> Iteration: 2
#> Iteration: 3
#> Iteration: 4
#> Iteration: 5
#> Iteration: 6
#> Iteration: 7
#> Iteration: 8
#> Iteration: 9
#> Iteration: 10
#> Estimated Bayes factor in favor of mv$Model over mv0$Model: 0.00001
```
With a $BF$ < 1, our data favours the intercept only model, showing that there is
insufficient evidence for stress predicting the overall sleep-wake composition.
Bayes factors provide a intuitive measure of the strength of evidence of one model over the other
or among different models. Check out the `bayestestR` packages for several other useful functions related to BFs.