-
Notifications
You must be signed in to change notification settings - Fork 7
/
Terminology.Rmd
221 lines (173 loc) · 10.8 KB
/
Terminology.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
---
title: "Terminology"
date: "Last edited: `r Sys.Date()`"
author: "Manuel Rademaker"
output:
prettydoc::html_pretty:
theme: cayman
highlight: github
toc: true
includes:
before_body: preamble.mathjax.tex
vignette: >
%\VignetteIndexEntry{csem-terminology}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
bibliography: ../inst/REFERENCES.bib
---
<!-- used to print boldface greek symbols (\mathbf only works for latin symbols) -->
```{r child="preamble-tex.tex", echo=FALSE, include=FALSE}
```
### Common factor{#commonfactor}
A common factor or latent variable is a type of [construct](#construct). The name common factor
is motivated by the theorized relationship to its [indicators](#indicator). This relation is commonly refered to as
the [measurement model](#mm). Two kinds of
measurement models exist: the reflective and the (causal-)formative measurement model.
The defining feature of a common factor in a reflective measurement model is the
idea that the common factor is the common underlying (latent) cause of the realizations
of a set of indicators that are said to measure the common factor.
The idea of the reflective measurement model is closely related to the
[true score](#truescore) theory. Accordingly, indicators related to the common factor
are modeled as measurement error-prone manifestations of a common variable,
cause or factor. Although there are subtle conceptional differences, the terms
common factor, true score, and latent variable are mostly used synommenoulsy in **cSEM**.
The common factor is also the central entity in the causal-formative
measurment model. Here the indicators are modeled as causing the/one common factor.
This type of measurement looks similar to the [composite (measurement) model](#composite), however,
both models are, in fact, quite different. While the causal-formative measurement
model assumes that the common factor is imperfectly measured by its indicators (i.e.,
there is an error term with non-zero variance) the composite in a composite model
is build/defined by its indicators, i.e., error-free by definition.
### Composite{#composite}
A composite is a weighted sum of [indicators](#indicator). Composites may
either serve as [constructs](#construct) in their own right or as [proxies](#proxy)
for a [latent variable](#latentvariable) which, in turn, serves as a
"statistical proxy" for a [concept](#concept) under study. The nature of the
composite is therefore defined by the type of (measurement)
model. If composites are error-free representations
of a concept we refer to the measurement model as the composite (measurement) model. If
composites are used as [stand-ins](#standin) for a [latent variable](#latentvariable),
the measurement model is called causal-formative [@Henseler2017].
Note that, although we sometimes use it this way as well, the term "measurement"
is acutally rather inadequate for the composite model since in a composite model
the construct is build/formed by its related indicators. Hence no measurement
in the acutal sense of the word takes place.
### Composite-based methods{#cbased}
Composite-based methods or composite-based SEM refers to the entirety of methods
centered around the use as of [composites](#composite) (linear compounts of observables)
as [stand-ins](#standin) or error-free representations for the [concepts](#concept) under investigation.
Composite-based methods are often also called
variance-based methods as focal parameters are usually retrived such that explained
variance of the dependend constructs in a structural model is maximized.
### Composite-based SEM{#cbasedsem}
See: [Composite-based methods](#cbased)
### Concept{#concept}
An entity defined by a conceptual/theoretical definition. In line with @Rigdon2016
variables representing or subsuming a concept are called conceptual variables.
The precise nature/meaning of a conceptual variable depends on
"differnt assumptions, philosophies or worldviews [of the researcher]"
[@Rigdon2016, p. 2]. Unless otherwise stated, in **cSEM**, it is sufficient
to think of concepts as entities that exist simply because they
have been defined. Hence, abstract terms such as "loyalty" or "depression" as
well as designed entities (artifacts) such as the "OECD Better Life Index"
are covered by the definition.
### Construct{#construct}
Construct refers to a representation of a [concept](#concept) within a given statistical
model. While a concept is defined by a conceptual (theoretical) definition,
a construct for a concept is defined/created by the researcher's operationalization
of a concept within a statistical model. Concepts are either modeled
as common factors/latent variables or as composites. Both operationalizations
- the common factor and the composite - are called constructs in **cSEM**.
As opposed to concepts, constructs therefore exist because they arise as
the result of the act of modeling their relation to the observable
variables (indicators) based on a specific set of assumptions.
Constructs may therefore best be understood as [stand-ins](#standin), i.e. statistical proxies for concepts. Consequently,
constructs do not necessarily represent the concept they seek to represent, i.e.,
there may be a validity gap.
### Covariance-based SEM
See: [Factor-based methods](#fbmethods)
### Factor-based methods{#fbmethods}
Factor-based methods or factor-based SEM refers to the entirety of methods
centered around the use of [common factors](#commonfactor) as statistical
[proxies](#proxy) for the [concepts](#concept) under investigation. Factor-based methods are also called
covariance-based methods as focal parameters are retrived such that the
difference between the model-implied $\bm\Sigma(\theta)$ and the empirical indicator covariance
matrix $\bm S$ is minimized.
### Indicator{#indicator}
An observable variable. In **cSEM** observable variables are generally refered
to as indicators, however, terms such as item, manifest variable, or
observable (variable) are sometimes used synommenoulsy.
### Latent variable{#latentvariable}
See: [Common factor](#commonfactor)
### Measurement model{#mm}
The measurement model is a statistical model relating [indicators](#indicator) to [constructs](#construct)
(the statistical representiation of a [concept](#concept)). If the concept under
study is modeled as a [common factor](#commonfactor) two measurement models exist:
- The reflective measurement model
- The causal-formative measurement model
If the concept under study is modeled as a [composite](#composite) we call the
measurement model:
- composite (measurement) model
Note that, although we sometimes use it this way as well, the term "measurement"
is acutally rather inadequate for the composite model since in a composite model
the construct is build/formed by its related indicators. Hence no measurement
in the acutal sense of the word takes place.
### Model
There are many ways to define the term "model". In **cSEM** we use the term model to
refer to both the theoretical and the statistical model.
Simply speaking theoretical models are a formalized set of hypotheses stating if and
how entities (observable or unobservable) are related. Since theoretical models are
by definition theoretical any statistical analysis inevitably mandates a statistical
model.
A statistical model is typically defined by a set of (testable) restrictions. Statistical
models are best understood as the operationalized version of the theoretical model.
Note that the act of operationalizing a given theoretical model always entails
the possibility for error. Using a [construct](#construct) modeled as a [composite](#composite)
or a [common factor](#commonfactor), for instance, is an attempt to map a theoretical entity (the [concept](#concept))
from the theoretical space into the statistical space. If this mapping is not
one-to-one the construct is only an imperfect representation of the concept.
Consequently, there is a validity gap.
NOTE: Note that we try to refrain from using the term "model" when describing
an estimation approach or algorithm such as partial least squares (PLS) as this
helps to clearly distinguish between the model and the approach used to estimate
*a given model*.
### Test score{#testscore}
A [proxy](#proxy) for a [true score](#truescore). Usually, the test score is a simple (unweighted)
sum score of [observables/indicators](#indicators), i.e. unit weights are assumed when
building the test score. More generally,
the test score can be any weighted sum of observables (i.e. a [composite](#composite)),
however, the term "test score" is historically closely tied to the idea that it is
indeed a simple sum score. Hence, whenever it is important to distinguish
between a true score as representing a sum score and a true score
as representing a weighted sum of indicators (where indicator weights a not necessarily
one) we will explicitly state what kind of test score we mean.
### True score{#truescore}
The term true score derives from the true score theory which theorizes/models
an [indicator](#indicator) or outcome as the sum of a true score and an error score. The term is
closely linked to the [latent variable/common factor](#commonfactor) model in that the true
score of a set of indicators are linear functions of some underlying common factor.
Mathematically speaking, the correspondence is $\eta_{jk} = \lambda_{jk}\eta_j$
where $\eta_{jk}$ is the true score, $\eta_j$ the underlying latent variable and $\lambda_{jk}$ the loading.
Despite some differences, the term true score can generally be used synommenoulsy
to the terms common factor and latent variable in **cSEM** without risking a misunderstanding.
### Proxy {#proxy}
Any quantity that functions as a representation of some other quantity.
Prominent proxies are the [test score](#testscore) - which serves as a stand-in/proxy
for the [true score](#truescore) - and the [composite](#composite) - which serves as a stand-in/proxy for a common factor if not used as a composite in its own right.
Proxies are usually - but not necessarily - error-prone representations
of the quantity they seek to represent.
### Saturated and non-saturated models
A structural model is called "saturated" if all [constructs](#construct) of the model are
allowed to freely covary. This is equivalent to saying that none of the path
of the structural model are restricted to zero. A saturated model has zero degrees
of freedom and hence carries no testable restrictions. If at least one path is
restricted to zero, the structural model is called "non-saturated".
### Stand-in{#standin}
See: [Proxy](#proxy)
### Structural Equation Modeling (SEM)
The entirety of a set of related theories, mathematical models, methods,
algorithms and terminologies related to analyzing the relationships
between [concepts](#concept) and/or observables.
### Variance-based methods {#vbmethods}
See: [Composite-based methods](#cbased)
## Literatur