Terrence edited this page Oct 26, 2020 · 53 revisions


Introduction

simsem provides three methods for running a simulation. First, simsem provides a matrix-based model specification framework. The framework is designed to offer features that current SEM packages do not have (e.g., random parameters, sophisticated model misspecification, nonnormal factor distributions, and fixed covariate data). Second, simsem can accept lavaan syntax for both data generation and data analysis when running a simulation. This approach is easier for those who are familiar with lavaan. Third, simsem can use an OpenMx model to generate data based on its starting values. The advantages of each approach are listed below:

The advantages of simsem matrix style

  • Generate data based on standardized parameters
  • Data generation with random parameters
  • Data generation with model misspecification
  • Control the order of 1) finding unspecified parameters (e.g., find residual variances when total variances are specified), 2) imposing equality/nonlinear constraints, and 3) imposing model misspecification
  • Sequential method for data generation (generate data at the factor level and use them to create indicator data)
  • Nonnormal factor distribution and nonnormal error distribution
  • Create data based on exogenous covariates
  • Bollen-Stine bootstrap
  • Slightly faster

The advantages of the lavaan syntax style

  • Generate data based on standardized parameters (see version 0.5-13 for full support)
  • Syntax-based input for both data generation and data analysis
  • Create endogenous ordered categorical variables

The advantages of the OpenMx object

  • Create endogenous ordered categorical variables
  • Simulate data based on definition variables
  • Mixture model

Common features for all styles

  • Parallel processing
  • Nonnormal indicator distribution (by copula or Vale and Maurelli's method)
  • Impose missing data (MCAR, MAR, or planned missing data)
  • Simulation with different sample sizes or percentages of missing data across replications
  • Nonlinear constraints and defined parameters
  • Generate data from lavaan output
  • Multiple imputation
  • Modeling auxiliary variables
  • Power analysis for the significance of parameter estimates; power analysis for rejecting bad models using absolute model fit, nested model comparison, or nonnested model comparison; accuracy in parameter estimation; and coverage rates of confidence intervals
  • Transform generated data and extract additional outputs
  • Run a simulation based on a population data set or a list of sample data sets
  • Run a simulation until the specified number of convergent replications is obtained
  • Analyze generated data with a user-written function that returns a vector of parameter estimates, standard errors, fit indices, and convergence status; the returned values are automatically saved in the simulation result

Users can generate data in one format and analyze them in a different format. See the examples for single-group analysis and multiple-group analysis.
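To make the first two styles concrete, here is a minimal sketch of the same two-factor CFA simulation written once with matrices and once with lavaan syntax. The population values (loadings of 0.7, a factor correlation of 0.5, residual variances of 0.51), the sample size of 200, and the small number of replications are illustrative assumptions, not values prescribed by this page; the object names are hypothetical as well.

```r
library(simsem)

## Matrix style ------------------------------------------------------------
# NA marks a free parameter; 0 fixes the loading to zero.
loading <- matrix(0, 6, 2)
loading[1:3, 1] <- NA
loading[4:6, 2] <- NA
LY <- bind(loading, 0.7)        # assumed population value for free loadings

# Factor correlation matrix: free off-diagonal, assumed population value 0.5.
latentCor <- matrix(NA, 2, 2)
diag(latentCor) <- 1
RPS <- binds(latentCor, 0.5)

# Error correlation matrix: identity, so measurement errors are uncorrelated.
RTE <- binds(diag(6))

cfaModel <- model(LY = LY, RPS = RPS, RTE = RTE, modelType = "CFA")

# 20 replications keep the sketch fast; a real study would use many more.
outMatrix <- sim(nRep = 20, model = cfaModel, n = 200)
summary(outMatrix)

## lavaan syntax style -----------------------------------------------------
# Data-generation model: every population value is written out explicitly.
popModel <- "
f1 =~ 0.7*y1 + 0.7*y2 + 0.7*y3
f2 =~ 0.7*y4 + 0.7*y5 + 0.7*y6
f1 ~~ 1*f1
f2 ~~ 1*f2
f1 ~~ 0.5*f2
y1 ~~ 0.51*y1
y2 ~~ 0.51*y2
y3 ~~ 0.51*y3
y4 ~~ 0.51*y4
y5 ~~ 0.51*y5
y6 ~~ 0.51*y6
"

# Analysis model: same structure, all parameters left free.
analysisModel <- "
f1 =~ y1 + y2 + y3
f2 =~ y4 + y5 + y6
"

outLavaan <- sim(nRep = 20, model = analysisModel, n = 200,
                 generate = popModel, lavaanfun = "cfa", std.lv = TRUE)
summary(outLavaan)
```

In the matrix version the residual variances are not given; simsem derives them from the loadings and factor correlations, whereas the lavaan version states them directly (0.51 = 1 - 0.7^2, so each indicator has unit total variance under these assumed values).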

This vignette is organized as a series of examples. New users are encouraged to read (or skim) the starred examples. The first part lists examples using matrix specification. The second part lists examples using lavaan syntax. The third part lists examples using OpenMx. The last part lists examples for simsem version 0.2 (or lower).

simsem matrix specification (version 0.3 or higher)

lavaan syntax specification (version 0.5 or higher)

(The numbers listed follow the order of examples in the simsem matrix specification)

OpenMx model specification (version 0.5 or higher)

(The numbers listed follow the order of examples in the simsem matrix specification)

simsemClassic or simsem version 0.2-*
