-
Notifications
You must be signed in to change notification settings - Fork 0
/
modelbpp.Rmd
375 lines (286 loc) · 8.03 KB
/
modelbpp.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
---
title: "Get Started"
author: "Shu Fai Cheung"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
number_sections: true
vignette: >
%\VignetteIndexEntry{Get Started}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: references.bib
csl: apa.csl
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 6,
fig.height = 6,
fig.align = "center",
fig.path = ""
)
```
# Introduction
This article illustrates how to
use `model_set()` and other functions
from the package
[`modelbpp`](https://sfcheung.github.io/modelbpp/)
to:
- Fit a set of neighboring models,
each has one more or one less degree
of freedom than the original fitted
model.
- Compute the BIC posterior probability
(BPP), for each model [@wu_simple_2020].
- Use BPP to assess to what extent
each model is supported by the
data, compared to all other models
under consideration.
# Workflow
1. Fit an SEM model, the
original model, as usual in `lavaan`.
2. Call `model_set()` on the output from
Step 1. It will automatically do the
following:
- Enumerate the neighboring models
of the original model.
- Fit all the models and compute their
BIC posterior probabilities (BPPs).
3. Examine the results by:
- printing the output of `model_set()`, or
- generating a graph using `model_graph()`.
# Example
This is a sample dataset,
`dat_serial_4_weak`,
with four variables:
```{r}
library(modelbpp)
head(dat_serial_4_weak)
```
## Step 1: Fit the Original Model
Fit this original model, a serial
mediation model, with one direct path,
from `x` to `y`:
```{r}
library(lavaan)
mod1 <-
"
m1 ~ x
m2 ~ m1
y ~ m2 + x
"
fit1 <- sem(mod1, dat_serial_4_weak)
```
This the summary:
```{r}
summary(fit1,
fit.measures = TRUE)
```
```{r echo = FALSE}
tmp <- fitMeasures(fit1)
fit1_cfi <- unname(tmp["cfi"])
fit1_rmsea <- unname(tmp["rmsea"])
```
The fit is acceptable, though the RMSEA
is marginal (CFI = `r formatC(fit1_cfi, 3, format = "f")`,
RMSEA = `r formatC(fit1_rmsea, 3, format = "f")`).
## Step 2: Call `model_set()`
Use `model_set()` to find the
neighboring models differ
from the target model by one on model
degrees of freedom, fit them, and compute
the BPPs:
```{r results = FALSE}
out1 <- model_set(fit1)
```
## Step 3: Examine the Results
To examine the results, just print
the output:
```{r}
out1
```
```{r echo = FALSE}
out1_bpp <- out1$bpp
out1_bpp_2 <- sort(out1_bpp, decreasing = TRUE)[2]
```
The total number of models examined,
including the original model, is `r length(out1$models)`.
(Note: The total number of models
was 9 in previous version. Please
refer to the Note in the printout for
the changes.)
The BIC posterior probabilities
(BPPs) suggest that
the original model is indeed the most
probable model based on BPP. However,
the model with the direct path
dropped, `drop: y~x`, only
has slightly lower BPP
(`r formatC(out1_bpp_2, 3, format = "f")`)
This suggests that, with equal prior
probabilities [@wu_simple_2020], the
support for the model with the direct
and without the direct path have similar
support from the data based on BPP.
Alternatively, we can use `model_graph()`
to visualize the BPPs and model relations
graphically:
```{r graph1, fig.height = 8, fig.width = 8, eval = FALSE}
graph1 <- model_graph(out1)
plot(graph1)
```
![](graph1-1.png)
Each node (circle) represents one model.
The larger the BPP, the larger the node.
The arrow points from a simpler
model (a model with larger model *df*)
to a more complicated model (a model
with smaller model *df*). If two models
are connected by an arrow, then
one model can be formed from another model
by adding or removing one free parameter
(e.g., adding or removing one path).
## Repeat Step 2 with User Prior
In real studies, not all models are
equally probable before having data
(i.e., not all models have equal
prior probabilities). A researcher
fits the original model because
- its
prior probability is higher than other
models, at least other neighboring
models (otherwise, it is not worthy
of collecting data assess thi original
model), but
- the prior probability
is not so high to eliminate the need
for collecting data to see how much it is
supported by data.
Suppose we decide that the prior probability
of the original model is .50: probable, but
still needs data to decide whether it is
empirically supported
This can be done by setting `prior_sem_out`
to the desired prior probability when calling
`model_set()`:
```{r results = FALSE}
out1_prior <- model_set(fit1,
prior_sem_out = .50)
```
The prior probabilities of all other
models are equal. Therefore, with
nine models and the prior of the target
model being .50, the prior probability
of the other eight model is (1 - .50) / 8
or .0625.
This is the printout:
```{r}
out1_prior
```
If the prior of the target is set to .50,
then, taking into account both the prior
probabilities and the data, the target
model is strongly supported by the data.
This is the output of `model_graph()`:
```{r out1_prior, fig.height = 8, fig.width = 8, eval = FALSE}
graph1_prior <- model_graph(out1_prior)
plot(graph1_prior)
```
![](out1_prior-1.png)
# Advanced Options
## More Neighboring Models
If desired, we can enumerate models
"farther away" from the target model.
For example, we can set the
maximum difference
in model *df* to 2, to include models
having two more or two less *df* than
the original model:
```{r results = FALSE}
out1_df2 <- model_set(fit1,
df_change_add = 2,
df_change_drop = 2)
```
This is the printout. By default, when there
are more than 20 models, only the top 20
models on BPP will be printed:
```{r}
out1_df2
```
The number of models examined, including
the original model, is `r length(out1_df2$models)`.
This is the output of `model_graph()`:
```{r graph1_df2, fig.height = 8, fig.width = 8, eval = FALSE}
graph1_df2 <- model_graph(out1_df2,
node_label_size = .75)
plot(graph1_df2)
```
![](graph1_df2-1.png)
Note: Due to the number of nodes,
`node_label_size` is used to reduce
the size of the labels.
## Excluding Some Parameters From the Search
When calling `model_set()`, users can
specify parameters that must be excluded
from the list to be added (`must_not_add`),
or must not be dropped (`must_not_drop`).
For example, suppose it is well
established that `m1~x` exists and should
not be dropped, we can exclude it
when calling `model_set()`
```{r results = FALSE}
out1_no_m1_x <- model_set(fit1,
must_not_drop = "m1~x")
```
This is the output:
```{r}
out1_no_m1_x
```
The number of models reduced to `r length(out1_df2$models)`.
This is the plot:
```{r out1_no_m1_x, ig.height = 8, fig.width = 8, eval = FALSE}
out1_no_m1_x <- model_graph(out1_no_m1_x)
plot(out1_no_m1_x)
```
![](out1_no_m1_x-1.png)
## Models With Constraints
If the original model has equality
constraints, they will be included in
the search for neighboring models,
by default. That is, removing one
equality constraint between two models
is considered as a model with an
increase of 1 *df*.
## Recompute BPPs Without Refitting the Models
Users can examine the impact of the prior
probability of the original model without
refitting the models, by using
the output of `model_set()` as the
input, using the `model_set_out` argument:
```{r results = FALSE}
out1_new_prior <- model_set(model_set_out = out1,
prior_sem_out = .50)
```
The results are identical to calling
`model_set()` with the original `lavaan`
output as the input:
```{r}
out1_new_prior
```
## Many Neighboring Models
When a model has a lot of free parameters,
the number of neighboring models can be
large and it will take a long time to
fit all of them. Users can enable
parallel processing by setting `parallel`
to `TRUE` when calling `model_set()`.
## More Options
Please refer to the help page of
`model_set()` for options available.
# Further Information
For further information on other
functions,
please refer to their help pages.
# References