Notes on I‐GSCA Implementation

emstruong edited this page May 13, 2024 · 38 revisions

These are developer notes on how I-GSCA is implemented in the cSEM package. They are intended to serve as both a refresher and a reference for cSEM developers.

Implementation of cSEM::igsca()

Flow Chart

At a high level, and omitting both (1) some mathematical details and (2) accessory steps that prepare data for computation, the implementation proceeds as follows:

flowchart TD
    A["`Prepare data using 
_cSEM::extract_parseModel()_`"]
    B["`Run igsca using 
_cSEM::igsca()_`"]
    C["`Prepare for ALS algorithm using 
_prepare_for_ALS()_`"] 
    D["`Alternating Least Squares Algorithm`"] 
    E["`Flip the signs of the **C**, **B** and **Gamma** using 
_flip_signs_ind_domi()_`"]
    Exit[(Exit)]
    A --> B --> C --> D --> E --> Exit
  • C stands for the loadings matrix
  • B stands for the structural-coefficients matrix
  • Gamma stands for the construct scores (i.e., factor scores and composite scores), computed as the matrix product of the indicators matrix (Z) and the weights matrix (W).
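
As a rough illustration of how Gamma relates to Z and W, here is a minimal R sketch. The toy data, the weight values and the construct names `eta1`/`eta2` are all made up for illustration. It also assumes standardization via `scale()` (which divides by the n - 1 standard deviation); the scaling actually used inside cSEM::igsca() may differ.

```r
# Toy data: 5 observations on 4 indicators (values are arbitrary).
set.seed(1)
indicators <- matrix(rnorm(20), nrow = 5, ncol = 4,
                     dimnames = list(NULL, c("x1", "x2", "y1", "y2")))

# Z: standardized indicators (column means 0, column sds 1).
Z <- scale(indicators)

# W: hypothetical weights matrix; each indicator loads on exactly one construct,
# so every row has a single non-zero entry.
W <- matrix(0, nrow = 4, ncol = 2,
            dimnames = list(colnames(Z), c("eta1", "eta2")))
W[c("x1", "x2"), "eta1"] <- c(0.6, 0.5)
W[c("y1", "y2"), "eta2"] <- c(0.7, 0.4)

# Gamma: construct scores, one row per observation and one column per construct.
Gamma <- Z %*% W
```

Each column of Gamma is simply a weighted sum of the standardized indicators assigned to that construct.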

prepare_for_ALS()

flowchart TD
    init["`Provides initial estimates of W, C and B using **GSCA** as implemented in
_gsca_inione()_`"]
    init2["`Initialize:
(1) **V**
(2) **Z** as standardized indicators matrix
(3) **Gamma** as matrix multiplication of **Z** and **W**
(4) **DU** through SVD`"]
    Exit[(Exit)]
    init --> init2 --> Exit
  • DU corresponds to the uniqueness terms. (U is capitalised because lower-case u corresponds to SVD output and not the error/uniqueness terms we are interested in.)

Alternating Least Squares Algorithm

  • Weights are done in one function
  • Loadings are done in a for loop sequentially for each construct variable
flowchart TD

    ChangeEps["`Evaluate if both are true:
(1) The sum of the absolute changes in the parameter estimates since the last iteration is greater than _ceps_
(2) The number of elapsed iterations is less than or equal to _itmax_`"]
    pseudoWeights["`Create pseudo-weights **X** and **WW** using 
_update_X_weights()_`"]
    ConstructUpdate["`Using a _for_ loop that iterates through each construct variable use:
(A) update_composite_LV() to update **W**, **Gamma** and **V** using **X**
(B) update_factor_LV() to update **W**, **Gamma** and **V** using **WW**
`"] 
    LoadingsFactors["`Update **C**, **B**, **D**, **uniqueD**, list of parameter estimates and **U** using
_update_C_B_D()_`"]
    Exit[(Exit)]


    Start --> ChangeEps -->|TRUE|pseudoWeights --> ConstructUpdate --> LoadingsFactors --> ChangeEps
    ChangeEps -- FALSE ----> Exit
  • X and WW are what I call 'pseudo-weights': they are not actually weights, but they are used to estimate the weights that compute the construct scores from the indicators. Which one is used depends on whether the indicator corresponds to a composite variable (X) or a latent factor (WW).
  • D is the matrix of error terms associated with each latent variable onto its indicators (Technically Du is.)
  • V includes s, which includes Du (meaning, DU)
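
The loop control in the flowchart above can be sketched in R roughly as follows. This is only a skeleton under stated assumptions: `toy_update()` and `run_als_skeleton()` are hypothetical names, and `toy_update()` is a stand-in for the real sweep (update_X_weights(), the per-construct updates and update_C_B_D()), not the cSEM code. How iterations are counted against _itmax_ is also an assumption taken literally from the flowchart.

```r
# Hypothetical stand-in for one full ALS sweep. It is NOT the cSEM update;
# it just contracts the estimates towards c(2, -1) so the loop can converge.
toy_update <- function(est) 0.5 * est + c(1, -0.5)

run_als_skeleton <- function(est, ceps = 1e-6, itmax = 100) {
  it <- 0
  change <- Inf  # forces the first pass through the loop
  # Continue only while BOTH conditions from the flowchart hold.
  while (change > ceps && it <= itmax) {
    it <- it + 1
    est_old <- est
    est <- toy_update(est)
    # Sum of absolute changes in the parameter estimates since the last iteration.
    change <- sum(abs(est - est_old))
  }
  list(estimates = est, iterations = it, converged = change <= ceps)
}

res <- run_als_skeleton(est = c(0, 0))
res_short <- run_als_skeleton(est = c(0, 0), itmax = 3)
```

With the default settings the toy iteration stops because the change falls below `ceps`. With `itmax = 3` it stops on the iteration limit instead, and `converged` is `FALSE`.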

Current Limitations

Suspected

  • Unable to have cross-loadings between one indicator and multiple construct variables
  • Construct variables and indicators cannot be regressed onto each other in the structural model (outside of the measurement model)

Confirmed

  • Indicators can only correspond to either a composite or a common factor variable, but not both. One reason is how extract_parseModel() currently works.

Development Cycle

To ensure correctness, test-igsca.R can be sourced repeatedly during development to check that modifications have not made cSEM::igsca() diverge markedly from the other implementations.

Comparisons with Other Implementations

The implementation of IGSCA in cSEM was compared against GSCAPro Version 1.2.1 and a Matlab version kindly sent by Dr. Heungsun Hwang.

GSCAPro V1.2.1

The .csv files for the output of GSCAPro are stored in tests/comparisons/igsca_translation/GSCAPro_1_2_1Output.

These .csv files were formatted for comparison using cSEM::get_lavaan_table_igsca_gscapro()

The .RData file for comparing GSCAPro and cSEM::igsca() is found in tests/data/igsca_gscapro.RData/

Matlab: igsca_sim_test.m

igsca_sim_test.m is a modified version of the original igsca_sim.m sent by Dr. Hwang. The modifications were made to facilitate repeated execution of the example model.

The .csv files for the input to igsca_sim.m are stored in tests/comparisons/igsca_translation/matlab_in. The .csv files were generated from write_for_matlab().

The results of igsca_sim_test.m can be read into R using R.matlab::readMat("FILEDIRECTORY/FILENAME.MAT"). Then, the individual matrices can be converted into a summary table for test comparison using cSEM::get_lavaan_table_igsca_matrix(). This summary table was converted into a .RData file called igsca_matlab.RData.

The .RData file for comparing Matlab and cSEM::igsca() is found in tests/data/igsca_matlab.RData/

Terminological Differences

Here, we adopt the terminology that:

  • Latent variables, common factors, factors, latent common factors are all the same concept
  • Component variables, composite variables, weighted sum scores are all the same concept
  • Latent and component variables are subsets of the more general class of construct variables.