```In this notebook I will explore how to perform Structural Equation Modeling```

- Ayoub El Majjodi
- copyright@22

- Structural Equation Models (SEM) are models that explain the relationships between __measured variables__ and __latent variables__, as well as the relationships __between latent variables__.

- A great example of a latent variable that cannot really be measured directly is Intelligence. We have plenty of school exams, IQ tests, psych tests to measure a concept like intelligence, but they always come down to

### When to use Structural Equation Modeling 

- To use SEM, you must first identify the underlying concepts that are important but not directly measurable.
- SEM is mostly applied as a confirmatory, hypothesis-testing method.
- SEM helps you learn the relationships between the different variables of a phenomenon.
- SEM gives you coefficient estimates for the hypothesized relationships between variables.

### Types of SEM

- Confirmatory Factor Analysis
- Confirmatory Composite Analysis
- Path Analysis
- Partial Least Squares Path Modeling
- Latent Growth Modeling

### Example

- The example is about performance at work. The most important variable in this project is __Job Performance__.
- __Job Performance__ is estimated based on three measured variables:
    -  ClientSat: A satisfaction rating between 1 and 100.
    -  SuperSat: A rating of job performance between 1 and 100.
    -  ProjCompl: The percentage of your projects that were successfully completed.
 - Hypothesis: Job Performance is strongly impacted by three other latent variables: **Social Skills, Intellectual Skills, and Motivation**.
 - Social Skills: PsychTest1, PsychTest2.
 - Intellectual Skills: YrsEdu, IQ.
 - Motivation: HrsTrn, HrsWrk.

 - Structural Equation Model diagrams often follow some general conventions:
    - Latent variables are denoted by circles.
    - Measured variables are denoted by squares.
    - Relationships are denoted by arrows.
    - Variances and residuals are denoted by arrows from a variable to itself.

<img src="./SEM.png">

The goal of Structural Equation Modeling techniques is to estimate a coefficient for each of the arrows in your diagram.

## Applying SEM in R

In [12]:
data <- read.csv('https://articledatas3.s3.eu-central-1.amazonaws.com/StructuralEquationModelingData.csv')
names(data)
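
Before fitting, it can help to confirm that every column the model refers to is present in the data. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, the column names are the ones used in the model definition in this notebook, and the toy data frame is made up so that the example runs without downloading anything.

```r
# Column names the model in this notebook refers to.
expected_cols <- c("ClientSat", "SuperSat", "ProjCompl",
                   "PsychTest1", "PsychTest2",
                   "YrsEdu", "IQ", "HrsTrn", "HrsWrk")

# Hypothetical helper: return the expected columns that are absent from df.
missing_columns <- function(df, expected) {
  setdiff(expected, names(df))
}

# Made-up stand-in for the real data, deliberately lacking HrsWrk.
toy <- as.data.frame(matrix(0, nrow = 2, ncol = 8))
names(toy) <- expected_cols[1:8]

missing_toy <- missing_columns(toy, expected_cols)
print(missing_toy)

# On the real data one would call: missing_columns(data, expected_cols)
```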

### lavaan is a great package for Structural Equation Modeling: it is well documented, easy to use, and consistent with the syntax of other R packages.

In [4]:
library(lavaan)

- Describe the model that we want to fit. It should represent the architecture that we have drawn in the diagram above.
- There are three different types of relations that we can specify:
    - `=~`: defines a latent variable in terms of its measured variables: a_latent_variable =~ measured_var_1 + measured_var_2
    - `~`: specifies a regression, for example of one latent variable on others: latent_variable_1 ~ latent_variable_2 + latent_variable_3
    - `~~`: specifies a (residual) variance or covariance: measured_var_1 ~~ measured_var_2
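
As a small illustration of the three operators side by side, here is a toy model string. The names `F1`, `F2`, `x1` to `x3` and `y1` to `y3` are hypothetical and do not refer to the job performance data. The optional last step assumes lavaan is installed and uses `lavaanify()` to parse the syntax into a parameter table without needing any data.

```r
# Hypothetical variable names, used only to show the syntax.
toy_model <- '
    # =~ : latent variable definitions (measurement model)
    F1 =~ x1 + x2 + x3
    F2 =~ y1 + y2 + y3

    # ~ : regression of one latent variable on another
    F2 ~ F1

    # ~~ : residual covariance between two measured variables
    x1 ~~ y1
'

# Optional: parse the syntax if lavaan is available.
if (requireNamespace("lavaan", quietly = TRUE)) {
  parsed <- lavaan::lavaanify(toy_model)
  print(parsed[, c("lhs", "op", "rhs")])
}
```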

In [16]:
## Create the model
model <- '
    # measurement model
    
    JobPref =~ ClientSat + SuperSat + ProjCompl
    Social =~ PsychTest1 + PsychTest2
    Intellect =~ YrsEdu + IQ
    Motivation =~ HrsTrn + HrsWrk
    
    # regressions
    JobPref ~ Social + Intellect + Motivation
'

In [18]:
## Fit the model and print the summary with standardized estimates

fit <- sem(model, data = data)
summary(fit, standardized = TRUE)



lavaan 0.6-11 ended normally after 269 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        24
                                                      
  Number of observations                          1000
                                                      
Model Test User Model:
                                                      
  Test statistic                              3321.607
  Degrees of freedom                                21
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  JobPref =~                                                            
    ClientSat         1

#### The Heywood case
A Heywood case occurs when an estimated variance is negative, which is relatively common in Structural Equation Modeling.
A common cause of a Heywood case is having variables that are too strongly correlated.
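
One way to look for a Heywood case is to scan the parameter table for variance rows with a negative estimate. The sketch below assumes a data frame shaped like the output of lavaan's `parameterEstimates()` (columns `lhs`, `op`, `rhs`, `est`). The helper `negative_variances` is hypothetical, and the toy table contains made-up numbers, not results from this notebook.

```r
# Hypothetical helper: list variance parameters with a negative estimate.
# A variance row has op "~~" and the same variable on both sides.
negative_variances <- function(pe) {
  is_variance <- pe$op == "~~" & pe$lhs == pe$rhs
  pe[is_variance & pe$est < 0, c("lhs", "est")]
}

# Made-up parameter table for illustration only.
toy_pe <- data.frame(
  lhs = c("F1", "x1", "x2", "x1"),
  op  = c("=~", "~~", "~~", "~~"),
  rhs = c("x1", "x1", "x2", "x2"),
  est = c(1.00, 0.40, -0.15, 0.10),
  stringsAsFactors = FALSE
)

heywood <- negative_variances(toy_pe)
print(heywood)

# With a fitted model one would call, for example:
# negative_variances(lavaan::parameterEstimates(fit))
```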

#### Chi-square Test of the overall SEM
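This test gives you a p-value for the null hypothesis that the model reproduces the observed covariance matrix exactly. A small p-value, such as the 0.000 reported above, therefore indicates that the model does not fit the data exactly. With large samples (here 1000 observations) even small discrepancies lead to rejection, so the test is usually read together with other fit indices.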
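
The degrees of freedom and p-value in the summary can be reproduced by hand with base R. The numbers below are read from the output above, and the count of nine measured variables comes from the model definition. This is a sketch of the arithmetic, not a replacement for lavaan's own computation.

```r
# Values read from the summary output in this notebook.
n_observed_vars <- 9         # measured variables in the model
n_free_params   <- 24        # "Number of model parameters"
chisq_stat      <- 3321.607  # "Test statistic"

# Unique variances and covariances among the measured variables: p(p + 1) / 2.
n_moments <- n_observed_vars * (n_observed_vars + 1) / 2

# Degrees of freedom = unique moments minus freely estimated parameters.
df_model <- n_moments - n_free_params

# Upper-tail probability of the chi-square distribution.
p_value <- pchisq(chisq_stat, df = df_model, lower.tail = FALSE)

print(df_model)
print(p_value)
```

The resulting degrees of freedom match the 21 shown in the summary, and the p-value is far below 0.001, which lavaan prints as 0.000.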

#### RC and SEM
How the three latent variables Social, Intellect, and Motivation impact Job Performance:

Since the p-values of the independent latent variables are all less than 0.05, we can conclude that each of them has a statistically significant effect on Job Performance.

We will look at how much each of the latent independent variables influences Job Performance.

### Looking at the estimates
- Motivation has the highest coefficient (2.57), meaning that a change in Motivation will have the largest impact on Job Performance.
- Intellect has the second-highest coefficient (0.72), meaning that Intellect has the second-highest impact on Job Performance.
- Social skills come in last with a coefficient of 0.32. Social skills still have an impact on Job Performance; it is just smaller than that of Motivation and Intellect.
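
The ranking can be sketched in a few lines of R using the three estimates quoted above. The commented lines at the end show one possible way to pull the regression rows from a fitted lavaan object. Because latent variables can be on different scales, the `std.all` column is often the safer basis for comparing effect sizes.

```r
# Regression estimates for JobPref as quoted in the text of this notebook.
coefs <- data.frame(
  predictor = c("Social", "Intellect", "Motivation"),
  estimate  = c(0.32, 0.72, 2.57),
  stringsAsFactors = FALSE
)

# Sort from largest to smallest estimate.
ranked <- coefs[order(coefs$estimate, decreasing = TRUE), ]
print(ranked)

# From a fitted model, the regression rows could be extracted with, for example:
# pe <- lavaan::parameterEstimates(fit, standardized = TRUE)
# pe[pe$op == "~", c("lhs", "rhs", "est", "pvalue", "std.all")]
```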
    
### Key Take-Aways
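The regression estimates suggest that Motivation is most important for Job Performance, followed in order by intellectual skills and social skills. Note that the chi-square test reported above has a p-value of 0.000, which rejects exact model fit, so the overall fit of the hypothesized model should be checked with additional fit indices before relying on this ranking.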