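### How does a structural equation model estimate latent variables?

In structural equation modeling, the measurement model is almost exactly the same thing as confirmatory factor analysis (CFA). Its core question is how to measure an abstract, unobservable variable from a set of observable ones.


The measurement model is built on true score theory, which holds that observed score (OS) = true score (TS) + measurement error (e). Suppose I measure intelligence with an IQ test: the test score is the OS, the person's actual intelligence is the TS, and the error of the instrument is e.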

![image.png](attachment:image.png)
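To make the decomposition concrete, here is a minimal simulation sketch in base R. The sample size, the IQ-like mean of 100, and the standard deviations of 15 and 5 are illustrative assumptions, not values from the course data.

```r
# Simulate true score theory: OS = TS + e (all numbers are illustrative assumptions)
set.seed(123)
n  <- 100000
TS <- rnorm(n, mean = 100, sd = 15)  # unobservable true score
e  <- rnorm(n, mean = 0, sd = 5)     # measurement error, independent of TS
OS <- TS + e                         # what the test actually records

# With independent error the variances add up: Var(OS) is close to Var(TS) + Var(e)
var_sum <- var(TS) + var(e)

# Reliability: the share of observed variance that is true-score variance
reliability_sim <- var(TS) / var(OS)
```

Under these assumed values the reliability should land near 225 / (225 + 25) = 0.9; a noisier instrument (larger error SD) pushes it down.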

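### Measuring an abstract concept with multiple observed variables

#### The life satisfaction scale as an example

Life satisfaction is a concept whose level cannot be pinned down in a few words, and unlike height or weight it has no unambiguous yardstick. Psychologists therefore work from an operational definition: they use several items to capture relevant behaviors and, through those behaviors, the concept behind them.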

![image.png](attachment:image.png)

In the figure above, every manifest (observed) variable can be decomposed as:

![image-2.png](attachment:image-2.png)
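Because the figure is an attachment, the same decomposition can also be written out as equations. The notation is assumed rather than taken from the figure: $\lambda_i$ is the loading of item $i$ on life satisfaction ($LS$), $e_i$ is its measurement error, and intercepts are omitted.

```latex
\begin{aligned}
\text{item}_i &= \lambda_i \, LS + e_i, \qquad i = 1, \dots, 5 \\
\operatorname{Var}(\text{item}_i) &= \lambda_i^2 \operatorname{Var}(LS) + \operatorname{Var}(e_i)
\end{aligned}
```

The second line follows from the first as long as $e_i$ is uncorrelated with $LS$, which is the usual CFA assumption.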

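#### Two points worth noting:

1. How many items should a factor have?

At least four are recommended. With only three items, a one-factor measurement model (CFA) is just-identified: it has zero degrees of freedom, so its fit cannot be tested.

2. How is the metric of the latent variable set?

A latent variable is never measured directly; only its items (the manifest/observed variables) are. Its scale therefore has to be fixed in one of two ways:

Either fix the variance of the latent variable to 1, so that each loading is the expected change in the item per standard deviation of the latent variable.

Or fix the factor loading of one item to 1 (as in the figure above). That item then serves as the reference standard, and the other loadings are read relative to it: a loading below 1 means the item responds less strongly to the latent variable than the reference item, and a loading above 1 means it responds more strongly.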
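The item-count rule can be checked by counting parameters. The helper below is a hypothetical illustration (`cfa_df` is not a lavaan function) for a one-factor model with uncorrelated errors and no mean structure; both ways of setting the metric leave the same number of free parameters.

```r
# Degrees of freedom of a one-factor CFA with p items (covariance structure only)
cfa_df <- function(p) {
  known <- p * (p + 1) / 2   # unique variances and covariances among the p items
  free  <- (p - 1) + 1 + p   # free loadings + factor variance + residual variances
  known - free
}

df_by_items <- sapply(3:5, cfa_df)
names(df_by_items) <- paste0("p=", 3:5)
df_by_items  # 0, 2 and 5 degrees of freedom
```

Three items leave nothing to test the model against, while the five-item life satisfaction model has five degrees of freedom.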

#### Defining a confirmatory factor analysis model in lavaan

In [1]:
library(lavaan)

"package 'lavaan' was built under R version 4.2.3"
This is lavaan 0.6-16
lavaan is FREE software! Please report any bugs.



In [2]:
# Load the data
BASE <- read.csv("C:/Users/77387/Desktop/Data_analysis_courses/R语言与结构方程模型/data/data3_10.csv")

In [17]:
head(BASE)

Unnamed: 0_level_0,id,item1,item2,item3,item4,item5,X,X.1,X.2,X.3,...,X.30,X.31,X.32,X.33,X.34,X.35,X.36,X.37,X.38,X.39
Unnamed: 0_level_1,<int>,<int>,<int>,<int>,<int>,<int>,<lgl>,<lgl>,<lgl>,<lgl>,...,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>,<lgl>
1,1,2,2,4,2,2,,,,,...,,,,,,,,,,
2,2,7,7,7,6,7,,,,,...,,,,,,,,,,
3,3,6,3,5,5,5,,,,,...,,,,,,,,,,
4,4,2,1,1,1,2,,,,,...,,,,,,,,,,
5,5,1,1,1,1,1,,,,,...,,,,,,,,,,
6,6,2,4,4,4,7,,,,,...,,,,,,,,,,


In [3]:
# Define the model
model.SPE <- 'LS =~ item3 + item1 + item2 + item4 + item5'

In [4]:
# Fit the model
model.EST <- cfa(model.SPE, data = BASE, estimator = "MLR")

In [5]:
# Extract the model results
summary (model.EST,
         fit.measures = TRUE, 
         standardized = TRUE,
         rsq = TRUE, 
         modindices = TRUE)

lhs,op,rhs,exo,est,se,z,pvalue,std.lv,std.all,std.nox
<chr>,<chr>,<chr>,<int>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
LS,=~,item3,0,1.0,0.0,,,1.2771581,0.7661992,0.7661992
LS,=~,item1,0,0.8589443,0.13115551,6.549052,5.790324e-11,1.0970077,0.5550862,0.5550862
LS,=~,item2,0,0.9912457,0.11458965,8.650395,0.0,1.2659775,0.6767968,0.6767968
LS,=~,item4,0,1.2264859,0.08477213,14.468031,0.0,1.5664164,0.8861091,0.8861091
LS,=~,item5,0,0.9533607,0.12148512,7.847552,4.218847e-15,1.2175924,0.6400659,0.6400659
item3,~~,item3,0,1.1473384,0.21810652,5.26045,1.437035e-07,1.1473384,0.4129387,0.4129387
item1,~~,item1,0,2.7022711,0.2745704,9.841815,0.0,2.7022711,0.6918793,0.6918793
item2,~~,item2,0,1.8962323,0.25608816,7.404607,1.314504e-13,1.8962323,0.5419461,0.5419461
item4,~~,item4,0,0.6712681,0.19577676,3.428743,0.0006063841,0.6712681,0.2148107,0.2148107
item5,~~,item5,0,2.1361841,0.2402733,8.890643,0.0,2.1361841,0.5903156,0.5903156

Unnamed: 0_level_0,lhs,op,rhs,mi,epc,sepc.lv,sepc.all,sepc.nox
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
12,item3,~~,item1,12.14866013,-0.47535243,-0.47535243,-0.26996378,-0.26996378
13,item3,~~,item2,0.08713499,0.0379449,0.0379449,0.02572539,0.02572539
14,item3,~~,item4,24.7470324,0.75859893,0.75859893,0.86440734,0.86440734
15,item3,~~,item5,7.67568928,-0.35986772,-0.35986772,-0.22986762,-0.22986762
16,item1,~~,item2,10.48543651,0.52189962,0.52189962,0.23055617,0.23055617
17,item1,~~,item4,2.88400414,-0.24797468,-0.24797468,-0.18411741,-0.18411741
18,item1,~~,item5,12.28871125,0.58852894,0.58852894,0.24495371,0.24495371
19,item2,~~,item4,9.53670707,-0.46030851,-0.46030851,-0.40799535,-0.40799535
20,item2,~~,item5,0.84085631,0.13686391,0.13686391,0.06800234,0.06800234
21,item4,~~,item5,0.10341261,-0.04704686,-0.04704686,-0.0392883,-0.0392883


In [6]:
# Compute reliability coefficients

library(semTools)

reliability(model.EST)

 

###############################################################################

This is semTools 0.5-6

All users of R (or SEM) are invited to submit functions or ideas for functions.

###############################################################################



Unnamed: 0,LS
alpha,0.8316982
omega,0.8283265
omega2,0.8283265
omega3,0.8159017
avevar,0.4946876


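#### lavaan fixes the first item's loading to 1 by default. How can item 3 be the fixed one, as in the figure above?

In [7]:
# NA* frees the first loading; 1* fixes item3's loading as the marker
model.SPE <- 'LS =~ NA*item1 + item2 + 1*item3 + item4 + item5'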

#### How do I get 95% confidence intervals for the parameter estimates?

In [8]:
parameterEstimates(model.EST)

lhs,op,rhs,est,se,z,pvalue,ci.lower,ci.upper
<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
LS,=~,item3,1.0,0.0,,,1.0,1.0
LS,=~,item1,0.8589443,0.13115551,6.549052,5.790324e-11,0.6018842,1.116004
LS,=~,item2,0.9912457,0.11458965,8.650395,0.0,0.7666541,1.215837
LS,=~,item4,1.2264859,0.08477213,14.468031,0.0,1.0603355,1.392636
LS,=~,item5,0.9533607,0.12148512,7.847552,4.218847e-15,0.7152543,1.191467
item3,~~,item3,1.1473384,0.21810652,5.26045,1.437035e-07,0.7198575,1.574819
item1,~~,item1,2.7022711,0.2745704,9.841815,0.0,2.164123,3.240419
item2,~~,item2,1.8962323,0.25608816,7.404607,1.314504e-13,1.3943087,2.398156
item4,~~,item4,0.6712681,0.19577676,3.428743,0.0006063841,0.2875527,1.054984
item5,~~,item5,2.1361841,0.2402733,8.890643,0.0,1.665257,2.607111


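#### Covariance matrix of the observed variables

The sample covariance matrix S is obtained by applying cov() to the item columns of BASE.

#### Model-implied covariance matrix / reproduced matrix

The reproduced covariance of two observed variables = loading of the first variable × loading of the second variable × variance of the latent variable. Equivalently, it is the product of the two `std.lv` loadings. The product of the two completely standardized (`std.all`) loadings gives the reproduced correlation, not the covariance.

The whole matrix can also be extracted in one step with the function below: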

In [9]:
fitted(model.EST)

Unnamed: 0,item3,item1,item2,item4,item5
item3,2.778471,1.401052,1.616853,2.000561,1.555058
item1,1.401052,3.905697,1.388787,1.718371,1.335708
item2,1.616853,1.388787,3.498931,1.983048,1.541445
item4,2.000561,1.718371,1.983048,3.124929,1.907257
item5,1.555058,1.335708,1.541445,1.907257,3.618715
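The `fitted()` matrix can be rebuilt by hand from the parameter estimates printed earlier. The sketch below copies those numbers into base R; the factor variance is not shown in the truncated output, so it is recovered here from the `std.lv` loading of item3, which equals the standard deviation of LS because that item's loading is fixed to 1.

```r
# Unstandardized loadings (est column) from the output above
lambda <- c(item3 = 1.0, item1 = 0.8589443, item2 = 0.9912457,
            item4 = 1.2264859, item5 = 0.9533607)
# Residual variances (est column)
theta  <- c(item3 = 1.1473384, item1 = 2.7022711, item2 = 1.8962323,
            item4 = 0.6712681, item5 = 2.1361841)
# Factor variance, recovered from item3's std.lv loading (= SD of LS)
psi <- 1.2771581^2

# Model-implied covariance matrix: Sigma = psi * lambda lambda' + diag(theta)
Sigma <- psi * (lambda %o% lambda) + diag(theta)
round(Sigma, 3)
```

For example, the item3 and item1 entry is 1 × 0.8589443 × 1.6311 ≈ 1.401, the value reported by `fitted()`.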


In [18]:
cov(BASE[c("item1","item2","item3","item4","item5")],use="complete.obs")

Unnamed: 0,item1,item2,item3,item4,item5
item1,3.921135,1.801189,1.093399,1.641779,1.817404
item2,1.801189,3.51276,1.642993,1.886652,1.645903
item3,1.093399,1.642993,2.789456,2.107778,1.35458
item4,1.641779,1.886652,2.107778,3.137282,1.902337
item5,1.817404,1.645903,1.35458,1.902337,3.633018


#### To obtain the residual covariance matrix (S - Σ):

In [19]:
residuals (model.EST, type = "raw")

Unnamed: 0,item3,item1,item2,item4,item5
item3,2.249532e-06,-0.3119582,0.0196708,0.09891773,-0.2058114
item1,-0.3119582,8.054594e-07,0.4053105,-0.08305562,0.4745403
item2,0.0196708,0.4053105,-8.056263e-07,-0.1038242,0.09797818
item4,0.09891773,-0.08305562,-0.1038242,1.740255e-06,-0.01240894
item5,-0.2058114,0.4745403,0.09797818,-0.01240894,-6.112083e-07
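Subtracting the `fitted()` matrix from the `cov()` matrix by hand does not reproduce these residuals exactly. A likely explanation, stated here as an assumption, is that lavaan rescales the sample covariance matrix by (N - 1) / N under normal-theory ML, whereas `cov()` divides by N - 1. The sketch below checks this with numbers copied from the outputs above; it also assumes both computations use the same complete cases.

```r
# Values copied from the outputs above
S_33     <- 2.789456   # cov(): sample variance of item3 (divisor N - 1)
S_31     <- 1.093399   # cov(): sample covariance of item3 and item1
Sigma_33 <- 2.778471   # fitted(): model-implied variance of item3
Sigma_31 <- 1.401052   # fitted(): model-implied covariance of item3 and item1

# Naive residual using the N - 1 covariance from cov()
naive_resid <- S_31 - Sigma_31               # about -0.3077

# Assumption: S is rescaled by (N - 1) / N. The residual variance of item3 is
# essentially zero, so the ratio of the two variances recovers that factor.
rescale      <- Sigma_33 / S_33
lavaan_resid <- S_31 * rescale - Sigma_31    # about -0.3120
```

The rescaled difference agrees with the -0.3119582 printed by `residuals()`, while the naive difference is off in the third decimal.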


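### When a measurement model contains two factors: two-factor confirmatory factor analysis

As an example, take the Hope Scale (the `hop.` items below), which has two relatively independent dimensions, pathways and agency.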

![image.png](attachment:image.png)

From the model above, we can see that every item can be decomposed as:

![image-2.png](attachment:image-2.png)
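Written out as equations, the two-factor decomposition looks as follows. The symbols are assumed notation rather than copied from the figure, and the sketch presumes no cross-loadings and uncorrelated errors, which is what the lavaan syntax in this section specifies.

```latex
\begin{aligned}
\text{hop.p}_i &= \lambda_{p_i}\,\text{pathways} + e_{p_i}, \qquad i \in \{1, 3, 4, 5\} \\
\text{hop.a}_j &= \lambda_{a_j}\,\text{agency} + e_{a_j}, \qquad j \in \{2, 6, 7, 8\} \\
\operatorname{Cov}(\text{hop.p}_i, \text{hop.a}_j) &= \lambda_{p_i}\,\lambda_{a_j}\,\operatorname{Cov}(\text{pathways}, \text{agency})
\end{aligned}
```

The last line shows why the factor covariance matters: it is the only route by which a pathways item and an agency item can covary in this model.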

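#### Defining this model in lavaan

lavaan automatically adds the covariance between the latent variables.

The covariance between two latent variables (most readably in its standardized form, the correlation) reflects their discriminant validity. In general, when discriminant validity is too low (the covariance is too high), it is a signal that the two latent variables would be better merged into one. This does not hold in test-retest models.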

In [22]:
# Data
BASE <- read.csv("C:/Users/77387/Desktop/Data_analysis_courses/R语言与结构方程模型/data/data3_11.csv")

# Two correlated factors, four items each
model.SPE <- 'pathways =~ hop.p1 + hop.p3 + hop.p4 + hop.p5
              agency   =~ hop.a2 + hop.a6 + hop.a7 + hop.a8'

#### Fit the model

In [23]:
model.EST <- cfa (model.SPE, data = BASE, estimator = "MLR")

#### Extract the model results

In [24]:
summary (model.EST, 
         fit.measures = TRUE, 
         standardized = TRUE, 
         modindices = TRUE)

lhs,op,rhs,exo,est,se,z,pvalue,std.lv,std.all,std.nox
<chr>,<chr>,<chr>,<int>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
pathways,=~,hop.p1,0,1.0,0.0,,,0.4666087,0.4707868,0.4707868
pathways,=~,hop.p3,0,2.4754451,0.70275937,3.522465,0.0004275539,1.1550643,0.6303454,0.6303454
pathways,=~,hop.p4,0,1.9759091,0.39999228,4.939868,7.817538e-07,0.9219764,0.6235902,0.6235902
pathways,=~,hop.p5,0,2.227085,0.65772916,3.386021,0.0007091386,1.0391773,0.633868,0.633868
agency,=~,hop.a2,0,1.0,0.0,,,1.5427511,0.8246905,0.8246905
agency,=~,hop.a6,0,0.7923199,0.07533972,10.516629,0.0,1.2223524,0.6185108,0.6185108
agency,=~,hop.a7,0,0.9410206,0.07177723,13.110295,0.0,1.4517606,0.7761177,0.7761177
agency,=~,hop.a8,0,0.6674817,0.07915367,8.432732,0.0,1.0297581,0.6177776,0.6177776
hop.p1,~~,hop.p1,0,0.7646057,0.13924973,5.490895,3.999015e-08,0.7646057,0.7783598,0.7783598
hop.p3,~~,hop.p3,0,2.023629,0.27052887,7.48027,7.41629e-14,2.023629,0.6026647,0.6026647

Unnamed: 0_level_0,lhs,op,rhs,mi,epc,sepc.lv,sepc.all,sepc.nox
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
20,pathways,=~,hop.a2,1.438449,0.473955843,0.221151932,0.118218609,0.118218609
21,pathways,=~,hop.a6,1.933333,0.596364296,0.278268784,0.140804109,0.140804109
22,pathways,=~,hop.a7,7.560645,-1.069876807,-0.499213853,-0.266881932,-0.266881932
23,pathways,=~,hop.a8,0.3219141,0.205321696,0.095804895,0.057475746,0.057475746
24,agency,=~,hop.p1,1.840617,-0.094838407,-0.146312059,-0.147622159,-0.147622159
25,agency,=~,hop.p3,2.679713,0.223860024,0.345360305,0.188471134,0.188471134
26,agency,=~,hop.p4,1.921329,-0.152138322,-0.234711568,-0.158750087,-0.158750087
27,agency,=~,hop.p5,0.6584658,0.099563299,0.153601392,0.093692396,0.093692396
28,hop.p1,~~,hop.p3,1.233187,-0.111574349,-0.111574349,-0.089697461,-0.089697461
29,hop.p1,~~,hop.p4,17.25995,0.33636536,0.33636536,0.332815329,0.332815329
