# T&D Comprehensive Project Analyses

Load Libraries

In [29]:
library(tidyverse)
library(psych)
library(lavaan)

## Data Import and Cleaning
Read in the survey data (data.csv) and the open-ended scores from rubric scoring (openended scoring.csv).

Clean the data (keep only actual cases and relevant variables).

Score the MC and open-ended items.

In [30]:
# read in survey data
data <- read.csv("data.csv", stringsAsFactors = FALSE)

# read in open-ended response scores
openendedscores <- read.csv("openended scoring.csv")

data <- data[-c(1:6), ] # remove invalid cases (first six rows)

# score MC questions
mc_questions <- data[, 18:30] # subset MC vars (columns 18 to 30)
mc_questions <- mc_questions %>%
  mutate_if(is.character, as.numeric) # convert to proper variable type
keys <- c(2, 5, 2, 5, 6, 2, 5, 2, 2, 4, 6, 1, 2) # answer key, one entry per MC item
# score = FALSE keeps the item-level right (1) / wrong (0) scores instead of totals
scores <- score.multiple.choice(keys, mc_questions, score = FALSE)
scores <- as.data.frame(scores)

# create scale scores for open-ended questions
openended <- select(openendedscores, Score, Score.1, Score.2, Score.3, Score.4)
keys_opend <- list(openended = c("Score", "Score.1", "Score.2", "Score.3", "Score.4"))
scored_openended <- scoreItems(keys_opend, openended)
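
To make the key-based scoring step concrete, here is a minimal base-R sketch of the idea: each response is compared with the keyed answer for its item, giving 1 for a match and 0 otherwise. The toy responses and the names `responses`, `toy_key`, and `item_scores` are hypothetical and are not taken from the survey data; `score.multiple.choice()` does more than this, so treat it as an illustration only.

```r
# Hypothetical toy responses: 3 respondents x 4 items (not the survey data)
responses <- data.frame(
  Q1 = c(2, 2, 1),
  Q2 = c(5, 3, 5),
  Q3 = c(2, 4, 2),
  Q4 = c(4, 5, 5)
)
toy_key <- c(2, 5, 2, 5) # hypothetical answer key, one entry per column

# Compare every response with the key for its column; TRUE/FALSE becomes 1/0
item_scores <- as.data.frame(
  sweep(as.matrix(responses), 2, toy_key, FUN = "==") * 1
)

# Number of correct answers per respondent
item_scores$total <- rowSums(item_scores)
print(item_scores)
```

Keeping the item-level 0/1 columns, rather than only the total, is what allows the item descriptives and reliability analyses in the later cells.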



## Descriptives

Descriptives for knowledge test scores

In [31]:
describe(scores)

item,vars,n,mean,sd,median,trimmed,mad,min,max,range,skew,kurtosis,se
Q1,1,7,0.8571429,0.3779645,1,0.8571429,0,0,1,1,-1.6198477,0.7959184,0.1428571
Q2,2,7,0.5714286,0.5345225,1,0.5714286,0,0,1,1,-0.2290811,-2.2040816,0.2020305
Q3,3,7,1.0,0.0,1,1.0,0,1,1,0,,,0.0
Q4,4,7,1.0,0.0,1,1.0,0,1,1,0,,,0.0
Q5,5,7,0.1428571,0.3779645,0,0.1428571,0,0,1,1,1.6198477,0.7959184,0.1428571
Q6,6,7,1.0,0.0,1,1.0,0,1,1,0,,,0.0
Q7,7,7,0.7142857,0.48795,1,0.7142857,0,0,1,1,-0.7528372,-1.6040816,0.1844278
Q8,8,7,0.8571429,0.3779645,1,0.8571429,0,0,1,1,-1.6198477,0.7959184,0.1428571
Q9,9,7,1.0,0.0,1,1.0,0,1,1,0,,,0.0
Q10,10,7,0.8571429,0.3779645,1,0.8571429,0,0,1,1,-1.6198477,0.7959184,0.1428571


Descriptives for open-ended scores (individual items)

In [32]:
describe(openended)

item,vars,n,mean,sd,median,trimmed,mad,min,max,range,skew,kurtosis,se
Score,1,7,3.571429,0.9759001,4,3.571429,1.4826,2,5,3,-0.1693884,-1.334082,0.3688556
Score.1,2,7,3.857143,1.069045,3,3.857143,0.0,3,5,2,0.2290811,-2.204082,0.404061
Score.2,3,7,3.714286,0.7559289,4,3.714286,1.4826,3,5,2,0.3644657,-1.454082,0.2857143
Score.3,4,7,3.142857,1.2149858,3,3.142857,1.4826,2,5,3,0.2535811,-1.813863,0.4592215
Score.4,5,7,2.714286,1.1126973,3,2.714286,1.4826,1,4,3,-0.1523727,-1.636034,0.42056


Descriptives for open-ended scores (scale)

In [33]:
describe(scored_openended$scores)

scale,vars,n,mean,sd,median,trimmed,mad,min,max,range,skew,kurtosis,se
X1,1,7,3.4,0.4898979,3.6,3.4,0.59304,2.6,4,1.4,-0.3499271,-1.52381,0.185164
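
The scale mean of 3.4 equals the average of the five item means reported above, which is what one would expect if the scale score is each respondent's mean across the rubric items. The sketch below illustrates that calculation in base R, assuming all items are keyed positively and averaged rather than summed (the default behaviour of `scoreItems()` as far as I am aware). The data frame `toy_rubric` is hypothetical.

```r
# Hypothetical rubric scores for 2 respondents on 3 items (not the study data)
toy_rubric <- data.frame(
  Score   = c(4, 3),
  Score.1 = c(5, 3),
  Score.2 = c(3, 4)
)

# Scale score = mean of the item scores within each respondent
scale_score <- rowMeans(toy_rubric)
print(scale_score)
```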


## Reliability

In [34]:
# LO 1
L01 <- scores %>%
  dplyr::select(Q1, Q2, Q3, Q4, Q5, Q6)
psych::alpha(L01)

# with dropped items (Q2, plus Q3, Q4, Q6, which have no variance)
L01 <- scores %>%
  dplyr::select(Q1, Q5)
psych::alpha(L01)


"Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option"

Some items ( Q2 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option

"NaNs produced"


Reliability analysis   
Call: psych::alpha(x = L01)

  raw_alpha std.alpha G6(smc) average_r   S/N ase mean   sd median_r
      -1.5      -1.2    -0.4     -0.22 -0.54 1.7 0.52 0.18    -0.35

 lower alpha upper     95% confidence boundaries
-4.82 -1.5 1.82 

 Reliability if an item is dropped:
   raw_alpha std.alpha G6(smc) average_r   S/N alpha se var.r med.r
Q1     -1.60     -1.78   -0.47     -0.47 -0.64     1.82    NA -0.47
Q2      0.29      0.29    0.17      0.17  0.40     0.54    NA  0.17
Q5     -1.00     -1.09   -0.35     -0.35 -0.52     1.41    NA -0.35

 Item statistics 
   n raw.r std.r r.cor r.drop mean   sd
Q1 7  0.47  0.63   NaN  -0.26 0.86 0.38
Q2 7  0.42  0.13   NaN  -0.54 0.57 0.53
Q5 7  0.35  0.54   NaN  -0.35 0.14 0.38

Non missing response frequency for each item
      0    1 miss
Q1 0.14 0.86    0
Q2 0.43 0.57    0
Q5 0.86 0.14    0

"data length [16] is not a sub-multiple or multiple of the number of columns [10]"


Reliability analysis   
Call: psych::alpha(x = L01)

  raw_alpha std.alpha G6(smc) average_r S/N  ase mean   sd median_r
      0.29      0.29    0.17      0.17 0.4 0.54  0.5 0.29     0.17

 lower alpha upper     95% confidence boundaries
-0.77 0.29 1.34 

 Reliability if an item is dropped:
   raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
Q1     0.167      0.17   0.028      0.17  NA       NA 0.167  0.17
Q5     0.028      0.17      NA        NA  NA       NA 0.028  0.17

 Item statistics 
   n raw.r std.r r.cor r.drop mean   sd
Q1 7  0.76  0.76  0.31   0.17 0.86 0.38
Q5 7  0.76  0.76  0.31   0.17 0.14 0.38

Non missing response frequency for each item
      0    1 miss
Q1 0.14 0.86    0
Q5 0.86 0.14    0
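
A negative alpha such as the -1.5 above can look like a software error, but it follows directly from the formula when items covary negatively. The sketch below computes raw Cronbach's alpha by hand, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), on two hypothetical toy data sets. The function name `cronbach_alpha` and the toy data are illustrative assumptions, not part of the original analysis, and the function omits everything else `psych::alpha()` reports.

```r
# Raw Cronbach's alpha from a respondents x items table
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                   # number of items
  item_vars <- apply(items, 2, var)  # variance of each item
  total_var <- var(rowSums(items))   # variance of the summed score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Hypothetical items that rise together: alpha reaches its maximum
consistent <- data.frame(a = c(1, 2, 3, 4), b = c(1, 2, 3, 4))

# Hypothetical items that move in opposite directions: the total score
# barely varies, so the variance ratio exceeds 1 and alpha turns negative
opposed <- data.frame(a = c(1, 2, 3, 4), b = c(4, 3, 2, 2))

print(cronbach_alpha(consistent))
print(cronbach_alpha(opposed))
```

With only seven respondents and several near-constant items, a single response pattern can flip the sign of an inter-item correlation, which is one plausible reason the estimates here are so unstable.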

In [35]:
# LO 2
L02 <- scores %>%
  dplyr::select(Q7, Q8, Q9, Q10, Q11, Q12, Q13)
psych::alpha(L02)

# with dropped items (Q9, Q11, Q12, which have no variance)
L02 <- scores %>%
  dplyr::select(Q7, Q8, Q10, Q13)
psych::alpha(L02)


"Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option"

Some items ( Q10 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option


Reliability analysis   
Call: psych::alpha(x = L02)

  raw_alpha std.alpha G6(smc) average_r S/N  ase mean   sd median_r
      0.52      0.51    0.71      0.21 1.1 0.29 0.75 0.29     0.28

 lower alpha upper     95% confidence boundaries
-0.05 0.52 1.1 

 Reliability if an item is dropped:
    raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r  med.r
Q7       0.55      0.51    0.64     0.259 1.05     0.26  0.14  0.471
Q8       0.26      0.25    0.37     0.101 0.34     0.47  0.13  0.091
Q10      0.62      0.67    0.70     0.403 2.02     0.26  0.08  0.471
Q13      0.23      0.19    0.40     0.074 0.24     0.48  0.25 -0.167

 Item statistics 
    n raw.r std.r r.cor r.drop mean   sd
Q7  7  0.59  0.58  0.44  0.205 0.71 0.49
Q8  7  0.76  0.76  0.75  0.560 0.86 0.38
Q10 7  0.38  0.41  0.22  0.059 0.86 0.38
Q13 7  0.81  0.80  0.74  0.510 0.57 0.53

Non missing response frequency for each item
       0    1 miss
Q7  0.29 0.71    0
Q8  0.14 0.86    0
Q10 0.14 0.86    0
Q13 0.43 0.57    0

"Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option"

Some items ( Q10 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option


Reliability analysis   
Call: psych::alpha(x = L02)

  raw_alpha std.alpha G6(smc) average_r S/N  ase mean   sd median_r
      0.52      0.51    0.71      0.21 1.1 0.29 0.75 0.29     0.28

 lower alpha upper     95% confidence boundaries
-0.05 0.52 1.1 

 Reliability if an item is dropped:
    raw_alpha std.alpha G6(smc) average_r  S/N alpha se var.r  med.r
Q7       0.55      0.51    0.64     0.259 1.05     0.26  0.14  0.471
Q8       0.26      0.25    0.37     0.101 0.34     0.47  0.13  0.091
Q10      0.62      0.67    0.70     0.403 2.02     0.26  0.08  0.471
Q13      0.23      0.19    0.40     0.074 0.24     0.48  0.25 -0.167

 Item statistics 
    n raw.r std.r r.cor r.drop mean   sd
Q7  7  0.59  0.58  0.44  0.205 0.71 0.49
Q8  7  0.76  0.76  0.75  0.560 0.86 0.38
Q10 7  0.38  0.41  0.22  0.059 0.86 0.38
Q13 7  0.81  0.80  0.74  0.510 0.57 0.53

Non missing response frequency for each item
       0    1 miss
Q7  0.29 0.71    0
Q8  0.14 0.86    0
Q10 0.14 0.86    0
Q13 0.43 0.57    0

In [36]:
#LO 3
alpha(openended)

"Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option"

Some items ( Score Score.4 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option


Reliability analysis   
Call: alpha(x = openended)

  raw_alpha std.alpha G6(smc) average_r  S/N  ase mean   sd median_r
      0.13      0.15    0.91     0.034 0.17 0.54  3.4 0.49    0.063

 lower alpha upper     95% confidence boundaries
-0.93 0.13 1.19 

 Reliability if an item is dropped:
        raw_alpha std.alpha G6(smc) average_r    S/N alpha se var.r  med.r
Score      -0.298    -0.348    0.74    -0.069 -0.258     0.80  0.18 -0.039
Score.1     0.071     0.057    0.34     0.015  0.061     0.55  0.11 -0.023
Score.2     0.182     0.190    0.72     0.055  0.234     0.53  0.15  0.063
Score.3     0.170     0.254    0.57     0.079  0.341     0.54  0.15  0.175
Score.4     0.279     0.281    0.58     0.089  0.390     0.44  0.11  0.175

 Item statistics 
        n raw.r std.r r.cor  r.drop mean   sd
Score   7  0.70  0.74  0.68  0.3847  3.6 0.98
Score.1 7  0.51  0.52  0.53  0.0842  3.9 1.07
Score.2 7  0.27  0.42  0.39 -0.0400  3.7 0.76
Score.3 7  0.50  0.36  0.35  0.0093  3.1 1.21
Score.4

## Validity evidence: confirmatory factor analysis 

In [37]:
# LO 1
# dropped Q3, Q4, Q6 due to no variance
L01 <- scores %>%
  dplyr::select(Q1, Q5) %>%
  mutate_all(ordered)

# store the fit under its own name so it does not mask lavaan::cfa()
fit <- cfa("LO1 =~ Q1 + Q5", data = L01)
summary(fit, fit.measures = TRUE)


    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified."

lavaan 0.6-3 ended normally after 5 iterations

  Optimization method                           NLMINB
  Number of free parameters                          4

  Number of observations                             7

  Estimator                                       DWLS      Robust
  Model Fit Test Statistic                          NA          NA
  Degrees of freedom                                -1          -1
  Minimum Function Value               0.0000000000000
  Scaling correction factor                           
  Shift parameter                                     
    for simple second-order correction (Mplus variant)

User model versus baseline model:

  Comparative Fit Index (CFI)                       NA          NA
  Tucker-Lewis Index (TLI)                          NA          NA

  Robust Comparative Fit Index (CFI)                            NA
  Robust Tucker-Lewis Index (TLI)                               NA

Root Mean Square Error of Approximation:

  RMSEA         

In [38]:
# LO 2
# dropped Q9, Q11, Q12 due to no variance
L02 <- scores %>%
  dplyr::select(Q7, Q8, Q10, Q13) %>%
  mutate_all(ordered)

# store the fit under its own name so it does not mask lavaan::cfa()
fit <- cfa("LO2 =~ Q7 + Q8 + Q10 + Q13", data = L02)
summary(fit, fit.measures = TRUE)


    The variance-covariance matrix of the estimated parameters (vcov)
    does not appear to be positive definite! The smallest eigenvalue
    (= -1.836953e-15) is smaller than zero. This may be a symptom that
    the model is not identified."

lavaan 0.6-3 ended normally after 20 iterations

  Optimization method                           NLMINB
  Number of free parameters                          8

  Number of observations                             7

  Estimator                                       DWLS      Robust
  Model Fit Test Statistic                       0.017       0.409
  Degrees of freedom                                 2           2
  P-value (Chi-square)                           0.991       0.815
  Scaling correction factor                                  0.736
  Shift parameter                                            0.385
    for simple second-order correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic                9.770       8.606
  Degrees of freedom                                 6           6
  P-value                                        0.135       0.197

User model versus baseline model:

  Comparative Fit Index (CFI)                    1.000       

In [39]:
# LO 3
LO3 <- openended
# store the fit under its own name so it does not mask lavaan::cfa()
fit <- cfa("LO3 =~ Score + Score.1 + Score.2 + Score.3 + Score.4", data = LO3)
summary(fit, fit.measures = TRUE)

    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified."

                  the optimizer may not have found a local solution;
                  use lavInspect(fit, "optim.gradient") to investigate"

lavaan 0.6-3 ended normally after 3334 iterations

  Optimization method                           NLMINB
  Number of free parameters                         10

  Number of observations                             7

  Estimator                                         ML
  Model Fit Test Statistic                       8.328
  Degrees of freedom                                 5
  P-value (Chi-square)                           0.139

Model test baseline model:

  Minimum Function Test Statistic               27.148
  Degrees of freedom                                10
  P-value                                        0.002

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.806
  Tucker-Lewis Index (TLI)                       0.612

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)                -38.004
  Loglikelihood unrestricted model (H1)        -33.840

  Number of free parameters                         10
  Akaike (AIC)
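
The degrees of freedom reported for the three models (-1, 2, and 5) can be checked by counting: df = number of sample statistics minus number of free parameters. The sketch below does this for a one-factor model, assuming lavaan's defaults as I understand them: the first loading is fixed to 1, each ordered item here is binary (one threshold), and residual variances are not free parameters for ordered indicators. The function `df_one_factor` is a hypothetical helper written for this illustration, not a lavaan function.

```r
# Count sample statistics and free parameters for a one-factor CFA
# p       = number of indicators
# ordinal = TRUE for binary ordered indicators, FALSE for continuous ones
df_one_factor <- function(p, ordinal) {
  if (ordinal) {
    # p thresholds + p(p-1)/2 polychoric correlations
    n_stats <- p + p * (p - 1) / 2
    # (p - 1) free loadings + 1 factor variance + p thresholds
    n_free <- (p - 1) + 1 + p
  } else {
    # p(p+1)/2 variances and covariances
    n_stats <- p * (p + 1) / 2
    # (p - 1) free loadings + 1 factor variance + p residual variances
    n_free <- (p - 1) + 1 + p
  }
  c(stats = n_stats, free = n_free, df = n_stats - n_free)
}

print(df_one_factor(2, ordinal = TRUE))  # LO1: Q1, Q5
print(df_one_factor(4, ordinal = TRUE))  # LO2: Q7, Q8, Q10, Q13
print(df_one_factor(5, ordinal = FALSE)) # LO3: five rubric scores
```

The free-parameter counts from this tally (4, 8, and 10) agree with the lavaan output above. For LO1 the model asks for more parameters than the data provide statistics, which is consistent with the "not identified" warning; a two-indicator factor generally needs an added constraint, such as equal loadings, before it can be estimated.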

## Item Level Correlation Matrix

In [40]:
cbind(scores, openended) %>%
  cor()

"the standard deviation is zero"

item,Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10,Q11,Q12,Q13,Score,Score.1,Score.2,Score.3,Score.4
Q1,1.0,-0.3535534,,,0.1666667,,-0.25819889,-0.1666667,,1.0,,,0.47140452,-0.19364917,-0.47140452,-0.1666667,-0.31108551,-0.1132277
Q2,-0.3535534,1.0,,,-0.4714045,,0.73029674,0.4714045,,-0.3535534,,,-0.16666667,0.22821773,0.75,0.4714045,0.36661779,-0.24019223
Q3,,,1.0,,,,,,,,,,,,,,,
Q4,,,,1.0,,,,,,,,,,,,,,
Q5,0.1666667,-0.4714045,,,1.0,,-0.64549722,0.1666667,,0.1666667,,,0.35355339,-0.25819889,-0.35355339,-0.4166667,-0.41478068,-0.28306926
Q6,,,,,,1.0,,,,,,,,,,,,
Q7,-0.2581989,0.7302967,,,-0.6454972,,1.0,0.6454972,,-0.2581989,,,0.09128709,0.75,0.54772256,0.6454972,0.08032193,0.1315587
Q8,-0.1666667,0.4714045,,,0.1666667,,0.64549722,1.0,,-0.1666667,,,0.47140452,0.71004695,0.35355339,0.4166667,-0.31108551,-0.1132277
Q9,,,,,,,,,1.0,,,,,,,,,
Q10,1.0,-0.3535534,,,0.1666667,,-0.25819889,-0.1666667,,1.0,,,0.47140452,-0.19364917,-0.47140452,-0.1666667,-0.31108551,-0.1132277
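
The warning "the standard deviation is zero" and the empty cells in the matrix come from items that every respondent answered identically, so their correlations are undefined. One way to get a matrix without those gaps is to drop constant columns before calling `cor()`. The sketch below shows this on a hypothetical data frame `toy`; it is a suggestion under that assumption, not what the cell above does.

```r
# Hypothetical data: y is constant, so its correlations are undefined
toy <- data.frame(
  x = c(1, 0, 1, 1),
  y = c(1, 1, 1, 1),
  z = c(0, 0, 1, 1)
)

# Flag the columns that actually vary
has_variance <- vapply(toy, function(col) sd(col) > 0, logical(1))

# Correlate only the varying columns; drop = FALSE keeps a data frame
# even if a single column remains
r <- cor(toy[, has_variance, drop = FALSE])
print(r)
```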
